Abstract: This paper discovers fundamental principles of the
backoff process that governs the performance of IEEE 802.11. A
simplistic principle founded upon regular variation theory is that
the backoff time has a truncated Pareto-type tail distribution with
an exponent of $(\log \gamma)/\log m$ ($m$ is the multiplicative factor
and $\gamma$ is the collision probability). This reveals that the per-node
backoff process is heavy-tailed in the strict sense for $\gamma > 1/m^2$,
and paves the way for the following unifying result.

The state-of-the-art theory on the superposition of heavy-tailed
processes is applied to establish a dichotomy exhibited
by the aggregate backoff process, putting emphasis on the
importance of the time-scales on which we view the backoff processes.
While the aggregation on normal time-scales leads to a Poisson
process, it is approximated by a new limiting process possessing
long-range dependence (LRD) on coarse time-scales. This
dichotomy turns out to be instrumental in formulating short-term
fairness, extending existing formulas to arbitrary population, and
elucidating the absence of LRD in practical situations. A refined
wavelet analysis is conducted to strengthen this argument.

Index Terms — Point process theory, regular variation theory, 
mean field theory. 



I. Introduction 

Since its introduction, the performance of IEEE 802.11
has attracted a lot of research attention and the center of
the attention has been the throughput [6], [27]. Recently,
other critical performance aspects of 802.11 also burst onto
the scene, which include short-term fairness [12], [26] and
delay [38]. It goes without saying that there has been a
phenomenal growth of Skype and IPTV users [14], [15] and it
is reported in [23] that an ever-increasing percentage of these
users connects to the Internet through wireless connections
in the US. Remarkably, it is found in [15] that jitter is more
negatively correlated with Skype call duration than delay, i.e.,
Skype users tend to hang up their calls earlier with large jitters.
This finding empirically testifies that large jitter of access networks
annoys Skype users, let alone QoS (quality of service). This
quantified dissatisfaction of users provides a motivation for
a thorough understanding of delay and jitter performance in
802.11.

For throughput analysis, Kumar et al., in the seminal paper
[27], axiomatized several remarkable observations based on a
fixed point equation (FPE), advancing the state of the art
to more systematic models and paving the way for more
comprehensive understanding of 802.11. Above all, one of the
key findings of [27], already adopted in the field [28], [34],
is that the full interference model¹, also called the single-cell
model [27], in 802.11 networks leads to the backoff
synchrony property ISTl which implies the backoff process 
can be completely separated and analyzed through the FPE 
technique. Another observation in [27] was that if the collision
probability $\gamma$ is constant, one can derive the so-called Bianchi's
formula by appealing to the renewal reward theorem [13], without
the Markov chain analysis in [6].

An intriguing notion, called short-term fairness, has been
introduced in some recent works [5], [12], [26], defining
$P[z|\zeta]$ as the probability that other nodes transmit $z$ packets
while a tagged node is transmitting $\zeta$ packets. It can be
easily seen that this notion pertains to a purely backoff-related
argument, also owing to the backoff synchrony property in
the full interference model [27]. The two papers [5], [12], in
the course of deriving equations for $P[z|\zeta]$, assumed that the
summation of the backoff values generated per packet, which
we denote by $\Omega$, is uniformly and exponentially distributed,
respectively. Specifically, despite the same situation where two
nodes contend for the medium, the former [5] assumed that $\Omega$
is uniformly distributed because the initial backoff is uniformly
distributed over the set $\{0, 1, \ldots, 2b_0 - 1\}$, where $2b_0$ is the
initial contention window, and observed in [5, Fig. 2] that this
assumption leads to a good match between the expression
$P[z|\zeta]$ derived under the uniform assumption on $\Omega$ and the
testbed data measured in their experiments, while the latter
[12] also observed in [12, Fig. 5(a)] that the testbed data
measured in their experiments closely match the expression
$P[z|\zeta]$ derived under the exponential assumption on $\Omega$:
Q1: "What makes two different observations?" (to be answered
in Section III)

In addition, the two works [5], [12] acquired the expression
of $P[z|\zeta]$ only for the two-node case. A more general formula
for an arbitrary number of nodes should deepen our appreciation
of short-term fairness. It is natural to ask the following
pertinent question:
Q2: "Can we develop a general model for short-term fairness?"
(to be answered in Corollaries 1 & 2)

In proportion as people take a growing interest in the
delay performance of 802.11, the number of fundamental
questions that we face increases. In [1], it was argued based
on simulation results that the access delay in 802.11 closely
follows a Poisson distribution. They have shown that the
number of successful packet transmissions by any node in the
network over a time interval has a probability distribution that
is close to Poisson by an upper bounded distribution distance.
This raises an intriguing question:

Q3: "Is there a Poissonian property? If yes, what is the
cause?" (to be answered in Theorem 1)

¹ In the full interference or single-cell model, every node interferes with the
rest of the nodes, i.e., its corresponding interference graph is fully connected.

Another case in point is found in a recent work [34] that
extends the access delay analysis in the seminal paper of
Kwak et al. [28] and makes an attempt at analyzing higher
order moments by applying the FPE technique. One interesting
finding in [34] is that the access delay has a wide-sense
heavy-tailed distribution [34, Theorem 1], which means that
its moment generating function $\int e^{tx} f(x)\,dx$ is $\infty$, $\forall t > 0$,
where $f(x)$ is the corresponding pdf (probability density
function) [32]. One should be careful in interpreting this finding
because the wide-sense heavy-tailedness does not imply
strict sense heavy-tailedness, which roughly means the ccdf 
(complementary cumulative distribution function) is of Pareto- 
type ifTTl with an exponent over (—2,0). In fact, there are 
lots of distributions, namely, lognormal, Pareto, Cauchy and 
Weibull distributions, which belong to the class of wide-sense 
heavy-tailed distributions. Consequently, the discussion poses 
the following challenge which is undoubtedly a tantalizing 
question. 

Q4: "What is the distribution type of the delay-related variables?"
(to be answered in Theorems 2 & 3)

Finally, it is, perhaps, surprising that long-range dependence
of 802.11 has not been rigorously analyzed even for the single
node case, not to mention the aggregate process of many
nodes. One minor contribution of this paper is that we prove
in Theorem 3 that the individual arrival process (consisting
of successful transmissions of one node) can be viewed as
a renewal process with heavy-tailed inter-arrival times, which
implies that the individual arrival process possesses long-range
dependence simply by appealing to [30].

However, for the superposition arrival process (consisting of
successful transmissions of all nodes), there is no clear answer.
For example, Tickoo and Sikdar [37] conjectured the absence
of long-range dependence of aggregate total load, which we
call the superposition arrival process. It is remarkable that the
absence of long-range dependence has also been supported
through empirical analysis such as the wavelet-based method [2]
by Veres and Boda [39] in the context of TCP flows in
wired networks. Since there is an analogy between the backoff
mechanisms adopted by 802.11 and TCP (in wired networks)
in that

1) both of them adopt backoff schemes (802.11) or retrans- 
mission scheme (TCP) where the probability of these 
events is either the collision probability (802.11) or the 
packet drop probability in router buffers (TCP), 

2) the mean of the backoff (contention window in 802.11) 
or retransmission time (timeout in TCP) doubles for each 
backoff or retransmission. 



one might wonder if there is a fundamental reason that 
elucidates these observations. 

Q5: "Does the aggregate transmission process possess long-range
dependence? If yes, why is it seldom observed?" (to be
answered in Theorem 4 and Section VII)

The focus of this paper is on the backoff process in 802.11,
since it plays the central role in quantifying the performance
of 802.11 [27]. For example, to grasp the heart of the delay
properties, the backoff value distribution in 802.11 DCF
(distributed coordination function) mode can be used as a
surrogate for the access delay f2M- As discussed above, the 
throughput performance and short-term fairness performance 
also depend on the backoff process and are particularly af- 
fected by the backoff synchrony property. Essentially, once the 
backoff distribution is obtained, various performance aspects 
can be analyzed. 

A. Contributions of this work 

This paper discovers fundamental principles of the backoff
process and provides answers to the open questions highlighted
above, which constitute the contributions of the paper.
In particular, the answers to most of the aforementioned
questions Q2-Q5 emerge in the course of deriving the
following two principles based on a new methodology, i.e.,
the point process approach.

• Power-tail principle: The per-packet backoff time distribution
has a slowly-varying power-tail (Theorem 3).

• Dichotomy of aggregation: Depending on the time-scales
on which the backoff processes are aggregated,
the resultant process becomes either Poissonian or a new
process (Theorems 1 & 4).

The power-tail principle, which is derivable only after we
accumulate a store of knowledge (Section III, Lemma 1
and Theorem 2), characterizes the backoff distribution in a
tractable and simplistic way, owing to regular variation theory,
answering Q4. The dichotomy of aggregation implies that,
when we view the aggregate process on normal time-scales, 
owing to the tendency of each component process to become 
sparse as population grows, we observe only a Poissonian as 
its marginal distribution. However, viewed on coarse time- 
scales, the aggregate process is identified as a long-range 
dependent process. This rigid dichotomy is instrumental in 
finding answers to Q2, Q3 and Q5, and expatiates upon the 
coexistence of contrary properties suggested by Q3 and Q5. 
All the theorems in the paper are closely linked with each 
other, forming a solid framework for the performance analysis 
of 802.11. These results help us to get the complex details 
of the backoff process in 802.11 into perspective under one 
framework. 

The rest of the paper is organized as follows. In Section
II, we revisit Bianchi's formula along with a survey of
recent advances in mean field theory, with which the analysis
of the backoff process at one node can be decoupled from other
nodes. In Section III, we present the exact distribution of
per-packet backoff. We establish in Section IV that the aggregate
backoff process can be approximated by a Poisson process
under the large population regime. In Section V, we extend
to asymptotic analysis and prove the power-tail principle. In
Section VI, we first propose a new process approximation on
a coarse time scale, which is then applied to formulate
short-term fairness and to identify long-range dependence. After
conducting a wavelet analysis on long-range dependence in
Section VII, we conclude this paper.

II. Bianchi's Formula Revisited

The backoff process in 802.11 is governed by a few rules
if the duration of per-stage backoff is taken to be exponential:
(i) every node in backoff stage $k$ attempts transmission with
probability $p_k$ for every time-slot; (ii) if it succeeds, $k$ changes
to 0; (iii) otherwise, $k$ changes to $(k + 1) \bmod (K + 1)$
where $K$ is the index of the highest backoff stage. Markov
chain models, which have been widely used in describing
complex systems including 802.11, however, very often lead
to excessive complications as discussed in Section I. In this
section, we present a surrogate tool for the analysis, mean
field theory. It is noteworthy that the rules used in 802.11,
i.e., (i)-(iii), closely resemble the mean field equations laid
out below.

A. Basic Operation of DCF Mode 

Time is slotted. Each node following the randomized access
procedure of 802.11 distributed coordination function (DCF)
generates a backoff value after receiving the Short Inter-Frame
Space (SIFS) if it has a packet to send. This backoff value is
uniformly distributed over $\{0, 1, \ldots, 2b_0 - 1\}$ (or $\{1, 2, \ldots,
2b_0\}$) where $2b_0$ is the initial contention window.

Whenever the medium is idle for the duration of a Distributed
Inter-Frame Space (DIFS), a node unfreezes (starts)
its countdown procedure of the backoff and decrements the
backoff by one per every time-slot. It freezes the countdown
procedure as soon as the medium becomes busy. There exist
$K + 1$ backoff stages whose indices belong to the set
$\{0, 1, \ldots, K\}$ where we assume $K > 0$. If two or more
wireless nodes finish their countdowns at the same time-slot,
there occurs a collision between RTS (request to send) packets
if the CSMA/CA (carrier sense multiple access with collision
avoidance) is implemented, otherwise two data packets collide
with each other. If there is a collision, each node that participated
in the collision multiplies its contention window by the
multiplicative factor $m$. In other words, each node changes its
backoff stage index $k$ to $k + 1$ and adopts a new contention
window $2b_{k+1} = 2m^{k+1} b_0$. If $k + 1$ is greater than the index
of the highest backoff stage, $K$, the node steps back
into the initial backoff stage whose contention window is set
to $2b_0$. In the IEEE 802.11b standard, $m = 2$, $K = 6$ (7
attempts per packet), and $2b_0 = 32$ are used.
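
The DCF rules above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, every attempt is assumed to collide independently with a fixed probability `gamma`, and the attempt made at stage $K$ is treated as the last one for the packet.

```python
import random

def contention_windows(b0=16, m=2, K=6):
    """Contention window 2*b_k = 2 * m**k * b0 for stages k = 0..K."""
    return [2 * b0 * m**k for k in range(K + 1)]

def sample_per_packet_backoff(gamma, b0=16, m=2, K=6, rng=random):
    """Draw the total backoff (in slots) accumulated for one packet.

    Assumption of this sketch: each attempt collides independently with
    probability gamma, and the attempt at stage K ends the packet.
    """
    total = 0
    for cw in contention_windows(b0, m, K):
        total += rng.randrange(cw)   # uniform on {0, ..., cw - 1}
        if rng.random() >= gamma:    # the attempt succeeded
            break
    return total
```

With the 802.11b defaults, `contention_windows()` returns the doubling schedule that starts at 32 slots and ends at 2048 slots.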

This work focuses on the performance of single-cell 802.11
networks, where it is sufficient to analyze the backoff process
in order to investigate the performance.

B. The Bianchi's Formula 

In performance analysis of 802.11, Bianchi's formula and
its many variants are probably the most known [6], [27].
Assuming that there are $N$ nodes, Bianchi's
formula can be written compactly in a more general fixed point
equation (FPE) form:

$$p = \frac{\sum_{k=0}^{K} \gamma^k}{\sum_{k=0}^{K} \gamma^k / q_k} \qquad \text{(FPE1)}$$

$$\gamma = 1 - e^{-(N-1)p} \qquad \text{(FPE2)}$$

where $p$ and $\gamma$ respectively designate the average attempt rate
and collision probability of every node at each time-slot. The
attempt probability in backoff stage $k$ is denoted by $q_k$ and
defined as the inverse of the mean contention window, i.e.,
$q_k = 2/(2b_k - 1)$. It satisfies $0 < q_k < 1$ as $b_k > 1$. Note that
Bianchi's formula holds under the well-known assumption:

A.1 All the transmission queues of nodes are saturated.
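
As an illustration of how the fixed point can be computed, the sketch below solves the two equations by bisection. It assumes that (FPE1) is the ratio $\sum_k \gamma^k / \sum_k \gamma^k/q_k$ and that (FPE2) is $\gamma = 1 - e^{-(N-1)p}$; the function names, the tolerance and the 802.11b-like parameter set are choices made for the example.

```python
import math

def attempt_rate(gamma, q):
    """Assumed form of (FPE1): p = sum(gamma^k) / sum(gamma^k / q_k)."""
    num = sum(gamma**k for k in range(len(q)))
    den = sum(gamma**k / qk for k, qk in enumerate(q))
    return num / den

def solve_fpe(N, q, tol=1e-12):
    """Bisection on h(gamma) = 1 - exp(-(N-1) p(gamma)) - gamma over [0, 1]."""
    def h(g):
        return 1.0 - math.exp(-(N - 1) * attempt_rate(g, q)) - g
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if h(mid) > 0:
            lo = mid
        else:
            hi = mid
    gamma = 0.5 * (lo + hi)
    return gamma, attempt_rate(gamma, q)

# Illustrative parameters: m = 2, K = 6, 2*b0 = 32, N = 10 saturated nodes
q = [2.0 / (2 * 16 * 2**k - 1) for k in range(7)]
gamma, p = solve_fpe(10, q)
```

Bisection is used rather than plain iteration of the map because it needs only a sign change of $h$ on $[0, 1]$, which holds here since $h(0) > 0$ for $N > 1$ and $h(1) < 0$.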

Exactly under which condition (FPE1) holds is recently
being rediscovered with rigorous mathematical arguments [4],
[11], [35], an approach sometimes called mean field approximation.
This fundamental approach was originally developed by
Bordenave et al. [11] and Sharma et al. [35]. Remarkably,
Bordenave et al. [11] adopted a generalized particle interaction
model which encompasses Markovian evolution of the system
other than particles at the same time. Benaim and Le Boudec
[4] overcame some limitations of the model [11], broadening
its applicability. The main result here is that, as the number of
particles goes to infinity, i.e., $N \to \infty$, the state distribution
of every node evolves according to a set of $(K + 1)$-dimensional
nonlinear ordinary differential equations under an appropriate
scaling of time. Benaim and Le Boudec [4] also observed
that the decoupling approximation represented by (FPE1) does
not hold if the differential equations do not have a unique
global attractor.

Remarkably, Bordenave et al. [11] have proven that the
differential equations are globally stable if $K = \infty$ and the
re-scaled attempt probability $Q_k := N q_k$ satisfies $Q_{k+1} = Q_k/2$
with $Q_0 < \ln 2$. In the meantime, Sharma et al. [35] obtained
a result for $K = 1$ and mentioned the difficulty to go beyond.
However, the case for other finite $K$ has remained to be proved
[4, pp. 833]. Recently, we solved this issue to a large extent [16]
by proving that, for finite $K$, a simplistic condition $Q_k < 1$
(or $q_k < 1/N$) for all $k \in \{0, \ldots, K\}$ guarantees the global
stability of the differential equations as well as the uniqueness
of the solutions to Bianchi's formula. As we discussed
in [16], there are still many outstanding problems upon the
stability of the associated differential equations.

While one of the aims of these efforts [4], [11], [16], [35]
is to identify the fundamental conditions under which the
collision probability is deterministic and time-invariant for
large population ($N = \infty$), once we assume the collision
probability is such for $N < \infty$, the demonstration of the
formula (FPE1) is shown to be straightforward [27]. That is
to say, we need to make the following simple assumption.

A.2 For each node, conditional upon its transmission attempt,
the collision events form an i.i.d. sequence, which is
independent from other nodes.

The observation in [27] was that, under the above assumptions,
one can easily derive (FPE1) by appealing to the renewal reward
theorem [13], without the Markov chain analysis in [6]. Thus
from now on, the attempt probability is given by (FPE1). As
a by-product, we can also see that the distribution of backoff
stages, which we denote by $\phi_k$, $k \in \{0, \ldots, K\}$, takes the
following form

$$\phi_k = \frac{\gamma^k}{q_k} \cdot \frac{1}{\sum_{j=0}^{K} \gamma^j / q_j} \qquad (1)$$



The expression of the collision probability (FPE2) was first
used in [27, Section IV]. A similar expression was also used
in [4], [11] under the intensity scaling, which means that the
attempt probability of every node in any backoff stage is of the
order of $1/N$. We use (FPE2) instead of its original version in
[6] because, as argued in [4], [11], the approximation provided
by (FPE2) is well founded on a mean field result. Lastly, it is
also noteworthy that the analysis in Sections III and IV does
not depend on whether $K$ is finite or not.

III. Backoff Analysis 

The backoff value distribution and the backoff stage distribution
should not be confused in meaning. While the latter is
the distribution of the backoff stage of a node, the former is
the distribution of the backoff value generated for initiating the
backoff countdown when the node has a packet to transmit.
The backoff value distribution at backoff stage $k$ has a discrete
uniform pdf (probability density function) $f_k(\cdot)$ on the integers
$\{0, 1, \ldots, 2b_k - 1\}$ with mean $1/q_k = (2b_k - 1)/2$ and variance

$$\sigma_k^2 = \frac{b_k^2}{3} - \frac{1}{12} = \underbrace{\frac{1}{3} \cdot \frac{2b_k + 1}{2b_k - 1}}_{v_k^2} \cdot \frac{1}{q_k^2}$$

where the pre-factor is denoted by $v_k^2$ to simplify the exposition
in the current section. Note that $\lim_{b_k \to \infty} v_k^2 = 1/3$.
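
The per-stage moments can be checked exactly by enumeration. The snippet below is a small sanity check rather than part of the analysis; it assumes the pre-factor has the closed form $(2b_k + 1)/(3(2b_k - 1))$, and the helper names are hypothetical.

```python
from fractions import Fraction

def uniform_stats(b):
    """Exact mean and variance of a uniform draw from {0, ..., 2b - 1}."""
    n = 2 * b
    mean = Fraction(sum(range(n)), n)
    var = Fraction(sum(x * x for x in range(n)), n) - mean**2
    return mean, var

def v_squared(b):
    """Assumed closed form of the pre-factor: (2b + 1) / (3 (2b - 1))."""
    return Fraction(2 * b + 1, 3 * (2 * b - 1))

for b in (16, 32, 1024):
    mean, var = uniform_stats(b)
    q = 1 / mean                       # attempt probability of the stage
    assert mean == Fraction(2 * b - 1, 2)
    assert var * q**2 == v_squared(b)  # variance = v^2 / q^2
```

Exact rational arithmetic is used so that the identity is verified without floating-point tolerance.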

Let $\Omega$ and $f_\Omega(\cdot)$ respectively denote the sum of the backoff
values generated for a packet, and its pdf. Also denote by $\bar\Omega$
its mean and $\sigma_\Omega^2$ its variance. It should be clear that the sum of
the backoff values generated for a packet, $\Omega$, which we baptize
in this paper per-packet backoff, can be formally defined as a
compound random variable:

$$\Omega := \sum_{k=0}^{\kappa} B_k \qquad (2)$$

where $B_k$ is a random variable denoting the backoff value
generated at the $k$th backoff stage, for a packet of a tagged
node, and $\kappa$ is also a random variable designating the highest
backoff stage reached by the packet.

The probability that the $k$th backoff stage is reached during
the backoff duration for a packet can be computed as $\gamma^k$
irrespective of the backoff distribution at any backoff stage.
Hence we have

$$P[\kappa = k] = \gamma^k - \gamma^{k+1}, \quad \forall k \in \{0, \ldots, K-1\}$$

and $P[\kappa = K] = \gamma^K$. From Bayes' theorem, $f_\Omega(\cdot)$ becomes:

$$f_\Omega(x) = \sum_{k=0}^{K} f_\Omega(x \mid \kappa = k) \cdot P[\kappa = k] \qquad (3)$$



where $f_\Omega(x \mid \kappa = k)$ denotes the pdf of the sum of the backoff values
from the 0th to the $k$th stage for a given $k$. Applying the fact that
the sum of $k + 1$ random variables with pdfs $f_0(\cdot), \ldots, f_k(\cdot)$ has
a pdf equal to the convolution of the pdfs yields

$$f_\Omega(x) = f^{*K}(x)\gamma^K + (1 - \gamma) \sum_{k=0}^{K-1} f^{*k}(x)\gamma^k \qquad (4)$$

where $f^{*k}(\cdot) := (f_0 * \cdots * f_k)(\cdot)$ is the convolution of $k + 1$
functions. In a similar way, $\bar\Omega$ can be computed from (2):


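$$\bar\Omega = E\left[\sum_{k=0}^{\kappa} B_k\right] = \sum_{k=0}^{K} \left(\sum_{i=0}^{k} \frac{1}{q_i}\right) \cdot P[\kappa = k]. \qquad (5)$$

By manipulating $\bar\Omega$ combined with the expression of $P[\kappa = k]$,
it is easy to see that

$$\bar\Omega = \sum_{k=0}^{K} \frac{\gamma^k}{q_k} \qquad (6)$$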
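
The step from the conditional form of the mean to the closed form $\sum_k \gamma^k/q_k$ can be confirmed numerically. The sketch below evaluates the mean both ways; the function names and the parameter values are illustrative assumptions.

```python
def stage_probs(gamma, K):
    """P[kappa = k]: gamma^k - gamma^(k+1) for k < K, and gamma^K for k = K."""
    return [gamma**k - gamma**(k + 1) for k in range(K)] + [gamma**K]

def mean_by_conditioning(gamma, q):
    """Sum over k of (1/q_0 + ... + 1/q_k) * P[kappa = k]."""
    partial, total = 0.0, 0.0
    for k, prob in enumerate(stage_probs(gamma, len(q) - 1)):
        partial += 1.0 / q[k]
        total += partial * prob
    return total

def mean_closed_form(gamma, q):
    """Sum over k of gamma^k / q_k."""
    return sum(gamma**k / qk for k, qk in enumerate(q))

# Illustrative 802.11b-like stages: q_k = 2 / (2 * 16 * 2**k - 1), K = 6
q = [2.0 / (2 * 16 * 2**k - 1) for k in range(7)]
```

The two functions agree because stage $k$ contributes its mean $1/q_k$ whenever it is reached, which happens with probability $\gamma^k$.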



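In addition, using $E[B_k^2] = (1 + v_k^2)/q_k^2$, the second moment
of $\Omega$ can be rearranged as

$$E[\Omega^2] = \sum_{k=0}^{K} E\left[\left(\sum_{k'=0}^{k} B_{k'}\right)^2\right] P[\kappa = k] \qquad (7)$$

$$= \sum_{k=0}^{K} \frac{\gamma^k}{q_k^2}\left(1 + v_k^2\right) + 2 \sum_{i=1}^{K} \frac{\gamma^i}{q_i} \sum_{j=0}^{i-1} \frac{1}{q_j}. \qquad (8)$$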

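The above equalities can be easily verified by rearranging (7)
and (8). Moreover, it is shown in Appendix A that, if $q_k =
2/(2b_0 m^k - 1)$ as in the standard, $\sigma_\Omega^2/\bar\Omega^2$ simplifies to

$$\frac{\sigma_\Omega^2}{\bar\Omega^2} = \frac{\sum_{k=0}^{K} A_k}{\left(\sum_{k=0}^{K} (b_0 m^k - 1/2)\gamma^k\right)^2} - 1 \qquad (9)$$

where

$$A_k = \left(b_0 m^k - \tfrac{1}{2}\right)\gamma^k \left\{ \left(1 + v_k^2\right)\left(b_0 m^k - \tfrac{1}{2}\right) - k + \frac{2b_0 (m^k - 1)}{m - 1} \right\}.$$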

Remark 1 R1.1 The result puts forward an alternative
viewpoint. We can view the backoff process reflecting the
collision effect among nodes as if there is no collision at all
and the per-packet backoff for every node has a distribution
with mean $\bar\Omega$ and CV $v_\Omega$ (or equivalently variance $\sigma_\Omega^2$).

R1.2 [Answer to Q1] Consider the case $N = 2$. It can
be computed from (9) that $\Omega$ is approximately uniformly
distributed in 802.11b while it is exponentially distributed in
802.11a/g in the sense that $v_\Omega \approx 0.7$ (though slightly larger
than $1/\sqrt{3}$) and $v_\Omega \approx 1.0$, respectively, mainly due to different
initial contention windows ($2b_0 = 32$ in 802.11b and $2b_0 = 16$
in 802.11a/g). This is the reason why they [5], [12] observed
that their testbed data closely match the expressions of
inter-transmission probability $P[z|\zeta]$, which were derived under
their respective assumptions. Note that we communicated with
the first author of [12] to verify the protocol (802.11g) used in
their testbed. We will formally define $P[z|\zeta]$ in Section VI-A.
To verify the analysis, simulations have been conducted. We
have used ns-2 version 2.33 with its built-in 802.11 module
and the parameter set of 802.11b, i.e., $m = 2$ and $2b_0 = 32$,
except that $K$ is varied to observe the asymptotic property. All
simulations use a 3000 s warm-up period and all quantities are
measured over the next 320,000 s (≈ 90 h).

Fig. 1. Per-packet backoff CV $v_\Omega$ vs. collision probability $\gamma$ for $K =
6, 15, \infty$ and $N = 2, \ldots, 100$.

Fig. 1 presents the per-packet CV $v_\Omega$, computed from (9),
(FPE1) and (FPE2), and compared with the simulation results.
The figure shows a good match between them. In the figure,
the intersecting points of contours of $K$ and $N$ at each level
decide $v_\Omega$ and $\gamma$ simultaneously. As is predicted by (9), $v_\Omega$
goes to $\infty$ as $K$ goes to $\infty$ for $\gamma > 1/m^2 = 0.25$. It is
remarkable that for a given $N > 9$ ($N > 5$ for 802.11a/g), $v_\Omega$
is extremely sensitive to $K$, forming a striking contrast with
the insensitivity of $\gamma$ to $K$.
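
The sensitivity of the CV to $K$ can be reproduced from the first two moments of the compound sum. The sketch below treats $\gamma$ as a free parameter instead of solving the fixed point equations, assumes discrete uniform per-stage backoff with contention window $2b_0 m^k$, and uses a hypothetical function name.

```python
import math

def per_packet_cv(gamma, K, b0=16, m=2):
    """CV of the per-packet backoff from its first two moments.

    Assumptions of this sketch: stage k is uniform on
    {0, ..., 2*b0*m**k - 1} and every attempt collides independently
    with probability gamma.
    """
    mean_sum = var_sum = 0.0   # mean and variance of B_0 + ... + B_k
    m1 = m2 = 0.0              # E[Omega] and E[Omega^2]
    for k in range(K + 1):
        n = 2 * b0 * m**k
        mean_sum += (n - 1) / 2
        var_sum += (n * n - 1) / 12
        prob = gamma**k if k == K else gamma**k * (1 - gamma)
        m1 += prob * mean_sum
        m2 += prob * (var_sum + mean_sum**2)
    return math.sqrt(m2 - m1 * m1) / m1

# With m = 2 the threshold 1/m^2 is 0.25
below = [per_packet_cv(0.20, K) for K in (10, 20, 30, 40)]
above = [per_packet_cv(0.30, K) for K in (10, 20, 30, 40)]
```

For `gamma = 0.20` the values settle as $K$ grows, whereas for `gamma = 0.30` they keep increasing, because the second moment contains terms of order $(m^2\gamma)^k$.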

The discrepancy between analysis and simulation study is
partly due to the reduced contention effect, which is a less-known
subtlety of DCF behavior discovered by Bianchi et al. [7] and
is shown through simulations to be a factor of error by Sakurai
and Vu [34].

IV. Point Process Approach: Poissonian Insights 

A basic property of per-packet backoff $\Omega$ discovered by
Kwak et al. [28, Theorem 1] and later strengthened by Kumar
et al. [27, Theorem 7.2] is that the mean of per-packet backoff
is proportional to the population, i.e., $\bar\Omega = \Theta(N)$. This turns
out to play a key role in our point process approach in this
section.

A. Justification of Point Process Approach 

In order to justify our point process approach, we need
to show that the backoff process of each node has nonzero
intensity, i.e., $\bar\Omega = E[\Omega]$ is finite. Though, for finite $K$, this
is self-evident from the form of (6), we need to assume the
following to prove $\bar\Omega < \infty$ for $K = \infty$.

A.3 $q_k = 2/(2b_0 m^k - 1)$ for all $k \in \{0, \ldots, K\}$, and $m > 1$.

Under this assumption we can prove the following lemma
which assures us that $\bar\Omega$ is finite whether $K$ is finite or not.
We also would like to point out that a part of the proof of
[27, Theorem 7.2], which corresponds to the case $K = \infty$ of
Lemma 1 in our work, has a flaw because they should have
proven $\gamma < 1/m$ before using $\sum_{k=0}^{\infty} (m\gamma)^k = 1/(1 - m\gamma)$.
Lemma 1 (Mean Exists)

Under the above assumption, there exists a finite $K_0$ such that
$\gamma < 1/m$ and $\gamma$ is decreasing in $K$. This implies:

• there exists $K_0$ such that $\gamma < 1/m$ for all $K > K_0$
including $K = \infty$,

• the mean $\bar\Omega = E[\Omega]$ exists for $K = \infty$.

Proof: Suppose $\gamma > 1/m$. Then we have from (FPE1)
and $q_k = 2/(2b_0 m^k - 1)$ that, for any $\epsilon > 0$, there exists
$K_1$ such that $p < \epsilon$ for all $K > K_1$. In the meantime, from
$1 - e^{-x} < x$, we also have $\gamma < (N - 1)p < (N - 1)\epsilon$. This
contradicts $\gamma > 1/m$, implying that there must exist $K_0$ such
that $\gamma < 1/m$ for $K = K_0$.

Denote the right-hand side of (FPE1) by $P(K)$. Since the
right-hand side of (FPE2) is increasing in $p$ and $P(K)$ is
nonincreasing in $\gamma$ from [27, Lemma 5.1], $1 - e^{-(N-1)P(K)}$
is nonincreasing in $\gamma$. Therefore, it suffices to show that

$$1 - e^{-(N-1)P(K_0+1)} < 1 - e^{-(N-1)P(K_0)},$$

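or equivalently $P(K_0 + 1) < P(K_0)$, for all $\gamma < 1/m$. After
some manipulation and some intricate factorization, it can be
verified that $P(K_0) - P(K_0 + 1)$ takes the form:

$$P(K_0) - P(K_0 + 1) = \frac{\gamma^{K_0+1} \sum_{k=0}^{K_0} \gamma^k\, b_0 \left(m^{K_0+1} - m^k\right)}{\left(\sum_{k=0}^{K_0} \gamma^k / q_k\right)\left(\sum_{k=0}^{K_0+1} \gamma^k / q_k\right)},$$

which is greater than zero for $m > 1$, implying that the
solution $\gamma^*$ of (FPE1) and (FPE2) for $K = K_0 + 1$ is smaller
than that for $K = K_0$. Applying mathematical induction
completes the proof. Also note that this implies $\gamma < 1/m$
for any $K > K_0$.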

For the case $K = \infty$, since we have shown that $\gamma < 1/m$
is decreasing in $K$ for all $K > K_0$, it follows from [33,
Theorem 3.14] that as $K$ goes to infinity, $\gamma$ should converge
to a limit satisfying $\gamma < 1/m$. The existence of $\bar\Omega$ follows from (6). ∎
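
The monotonicity used in the proof can be checked numerically for a fixed $\gamma$. The sketch below assumes that the right-hand side of (FPE1) is the ratio $\sum_k \gamma^k / \sum_k \gamma^k/q_k$ with $q_k$ as in A.3, so that $1/q_k = b_0 m^k - 1/2$; the function name and the default parameters are illustrative.

```python
def P(gamma, K, b0=16, m=2):
    """Assumed right-hand side of (FPE1) under A.3 for a given K."""
    num = sum(gamma**k for k in range(K + 1))
    den = sum(gamma**k * (b0 * m**k - 0.5) for k in range(K + 1))
    return num / den

gamma = 0.3
values = [P(gamma, K) for K in range(12)]

# Each additional backoff stage lowers the attempt rate at fixed gamma
assert all(a > b for a, b in zip(values, values[1:]))
```

Adding a stage appends a term with a larger mean backoff than all earlier ones, which pulls the ratio down; this is the mechanism behind $P(K_0 + 1) < P(K_0)$.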

Since $m > 1$ guarantees that there exists $K_0$ such that
$\gamma < 1/m$ for all $K > K_0$, it can be seen from (1) that,
for the case of $K = \infty$, $m > 1$ is also a sufficient condition
for the existence of $\bar{k}$ such that $\phi_k > \phi_{k+1}$ for all $k > \bar{k}$,
i.e., the average number of nodes in backoff stage $k$ is larger
than that in backoff stage $k + 1$. This corresponds to the
tightness condition of $\phi_k$, which prevents a node from escaping
to infinite backoff stage [8]. The fact that the condition $m > 1$
prevents a node from escaping to infinite backoff stage appears
to be in best agreement with our usual intuition.
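
The tightness condition can be illustrated with a short computation. The sketch assumes that the stage distribution is proportional to $\gamma^k/q_k$ with $q_k$ as in A.3; the function name and the two values of $\gamma$ are chosen only for the example.

```python
def stage_distribution(gamma, K, b0=16, m=2):
    """phi_k proportional to gamma^k / q_k, with 1/q_k = b0*m**k - 1/2."""
    weights = [gamma**k * (b0 * m**k - 0.5) for k in range(K + 1)]
    total = sum(weights)
    return [w / total for w in weights]

phi_tight = stage_distribution(0.3, 10)   # gamma < 1/m: mass stays in low stages
phi_loose = stage_distribution(0.6, 10)   # gamma > 1/m: mass drifts to high stages
```

The ratio of consecutive weights is close to $m\gamma$, so the sequence decays geometrically when $\gamma < 1/m$ and grows otherwise.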

B. Essential Assumption 

To establish the Poisson limit result in Theorem 1 and to justify
the point process approach in the remaining sections, we need the
following essential assumption.

A.4 Per-stage backoff distribution $f_k(\cdot)$ is a uniform continuous
function. It also means $v_k = 1/\sqrt{3}$.
Recall that $f_\Omega(\cdot)$ is expressed by (4), hence now it is a
weighted sum of convolutions of continuous pdfs $f_k(\cdot)$ where
the weight for each convolution function $f^{*k}(\cdot) = (f_0 * \cdots *
f_k)(\cdot)$ is $(1 - \gamma)\gamma^k$, which is a function of $\gamma$. As we noted in
Remark 1, $f_\Omega(\cdot)$ reflects the collision effect through $\gamma$ which
determines how much $f_\Omega(\cdot)$ is dispersed.
On continuity assumption: Denote by $D^n(t)$ the number of
cumulative per-node successful transmissions until time-slot $t$.
Formally, $D^n(t)$ is a discrete-time renewal process that counts
the number of arrivals during the interval $[0, t]$ where the
inter-arrival times are i.i.d. copies of the discrete random variable
$\Omega$. Consider the superposition process $D(t) := \sum_{n=1}^{N} D^n(t)$. A
subtlety in 802.11 is that there may be no intervening backoff
time-slot between two consecutive successful transmissions.
More precisely, at the beginning of a backoff time-slot,
if the transmission attempts of nodes lead to a successful
transmission, the time-slot is rendered unused, meaning that
the time-slot is reused after the successful transmission. The
same subtlety applies to collision events. Simply suppose the
probability that a successful transmission (or a collision event)
occurs at the beginning of a time-slot converges to $P_s$ (or $P_c$)
as $N \to \infty$. Putting

$$P(x) := P\left[\lim_{N \to \infty} D(t + 1) - D(t) = x\right], \quad x \in \{0, 1, \ldots\},$$

we can see from the subtlety that

$$P(x + 1) = P(x) \cdot \sum_{i=0}^{\infty} P_c^i P_s = \frac{P_s}{1 - P_c} P(x).$$

Because $\sum_{x=0}^{\infty} P(x) = 1$, we have a geometric distribution

$$P(x) = \left(1 - \frac{P_s}{1 - P_c}\right) \left(\frac{P_s}{1 - P_c}\right)^x, \quad x \in \{0, 1, \ldots\},$$

hence the limiting (as $N \to \infty$) distribution of the cumulative
process $D(t)$ for arbitrary integer $t$ takes a Pascal (negative
binomial) distribution². This fact can be exploited for a more
accurate approximation. A simpler approximation at the cost
of accuracy is to be presented in Theorem 1.

Once again, the continuity assumption turns out unavoidable
in Section V because regular variation theory [10] exploited
by Theorem 3 is not well developed for discrete functions.
The uniform distribution assumption of $f_k(\cdot)$ was made only
to simplify the exposition of Theorems 2 and 3 in Section V.

C. Poisson Process Approximation 

We can now view the backoff procedure of node $n$ as
a stationary simple renewal process $A^n(t)$ that counts the
number of arrivals during the interval $(0, t]$ where the $j$th
inter-arrival times, $T_j^n - T_{j-1}^n$, are given by the i.i.d. copies of the
continuous random variable $\Omega$. Then the backoff procedure of
all nodes can be regarded as a superposition of $N$ statistically
identical renewal processes, i.e.,

$$A(t) := \sum_{n=1}^{N} A^n(t).$$

It should be remarked that, if one or more component pro- 
cesses are not Poisson, the superposition process A{t) is 
not renewal, and even if the inter-arrival times of A{t) are 
identically distributed, they are not independent jS). 

In the following, we present a novel way to tackle this 
analytical difficulty caused by the dependence among the inter- 
arrival times of the superposition process. The key observation 

"Sakurai and Vu Ii34| Section III-B] assumed D{t) is a Bernoulli process. 
This simplification was justified by the reduced contention effect (7). 



is that the entropy of the superposition point process A{t) 
increases with N, which is implied by the following known 
result nil Proposition 11.2.VI]. 

Lemma 2 (Poisson Limit for Superposition)

Let $S(t)$ denote the point process obtained by superposing $M$
independent replicates $B^m(t)$, $m \in \{1, \ldots, M\}$, of a simple
stationary point process with intensity $\lambda$ and dilating the
time-scale by a factor $M$. Formally speaking,

$$S(t) := \sum_{m=1}^{M} B^m(t/M). \qquad (10)$$

Then as $M \to \infty$, $S(t)$ converges weakly to a Poisson process
with the intensity $\lambda$.

Now it follows from the basic property [27, Theorem 7.2]
for $K = \infty$ that the mean inter-arrival time of $A^n(t)$, $\bar\Omega$, is of
order $N$. Therefore, there must exist a point process

$$B^n(t) := \lim_{N \to \infty} A^n(Nt) \quad \text{with intensity} \quad \lambda = \lim_{N \to \infty} N/\bar\Omega$$

where the intensity $\lambda$ does not scale with $N$ and we have
$B^n(t/N) \approx A^n(t)$ as $N$ goes to $\infty$. This in turn implies

$$A(t) = \sum_{n=1}^{N} A^n(t) \approx \sum_{n=1}^{N} B^n(t/N),$$

which has the same form as (10). Applying Lemma 2 to the
above equation leads to the following theorem.

Theorem 1 (Dichotomy of Aggregation: First Part)

Suppose $\bar\Omega = \Theta(N)$. Then the superposition process
$\sum_{n=1}^{N} A^n(t)$ converges weakly to a Poisson process as $N \to
\infty$.

Remark 2 This result states that the Poissonian nature is inherent in the backoff process of 802.11 and provides an answer to Q3.

R2.1 The reason we do not require $K = \infty$: Recalling our discussion at the beginning of this section, we can see that

$$K = \infty \ \overset{\text{[27, Theorem 7.2]}}{\Longrightarrow}\ \bar{\Omega} = \Theta(N) \ \overset{\text{Theorem 1}}{\Longrightarrow}\ \text{Poisson}.$$

If we required $K = \infty$ instead of $\bar{\Omega} = \Theta(N)$, the above theorem would look simpler, but it would not be applicable to the case $K < \infty$. Even if $K$ is finite, the crucial scaling condition $\bar{\Omega} = \Theta(N)$ holds for a wide range of $N$, as hinted by previous works (see the simulation result with a practical parameter set in [34, Figures 2 and 5]). However, for extremely large $N$, the scaling becomes $\bar{\Omega} = \Theta(1)$.

R2.2 From a different angle, the backoff procedure of 802.11 along with its setting $K = 6$ is intentionally designed so that the successful attempt intensity of each node, $1/\bar{\Omega}$, is kept of the order of $1/N$ for a wide range of $N$, by allowing a sufficient number of backoffs for each packet.

What is the premise of Poisson limit?: The question remains whether the approximation is precise even for $t = \infty$. As Whitt discussed in [41, Chapter 9.8], the underlying assumption of the Poisson limit theorem (Lemma 2) is that $t$ is finite. In the meantime, the basic premise of the Poisson limit theorem is that the component process should become sparse ($\bar{\Omega} = \Theta(N)$) BQI pp.83]. If we allow $t \to \infty$ at the same time as $N \to \infty$, $A^n(t)$ may not remain sparse. This is essentially why we must adopt another approximation in Section VI where $t = \Theta(N)$. In light of these points, the above theorem provides a natural approximation of the backoff processes on normal time-scales, as compared with the other approximation in Section VI on coarse time-scales.

V. Asymptotic Analysis 

A stochastic process with infinite variance and self-similarity exhibits phenomena called Noah effect and Joseph effect, respectively, in Mandelbrot's terminology [36], [41]. Noah and Joseph effects refer to the biblical figures Noah, who experienced an extreme flood (exceptionally large values), and Joseph, who experienced long periods of plenty and famine (self-similarity or strong positive dependence). This section lifts the veil to discover these effects and to explain their influences on the backoff process in 802.11. We have not assumed $K = \infty$ because all results derived so far are applicable whether $K$ is finite or infinite (see also Remark 2). However, all results derived in this section require $K = \infty$, hence we formally assume the following.

A.5 There are infinite backoff stages, i.e., $K = \infty$.

A. Moment Analysis 

We introduce the notion of a wide-sense heavy-tailed distribution borrowed from [32]. We call a pdf $f(x)$ wide-sense heavy-tailed if its moment generating function is infinite, i.e.,

$$\int_0^\infty e^{tx} f(x)\,dx = \infty, \quad \forall t > 0.$$

We now characterize the existence of all fractional moments of $\Omega$. Let us define

$$\alpha := -(\log\gamma)/\log m,$$

where $\alpha > 1$ is satisfied by Lemma 1. Also it is remarkable that Sakurai and Vu [34] established a similar result for integer moments. Note however that we cannot prove Theorem 3 without the following extended result for fractional moments.

Theorem 2 (Existence of Fractional Moments)

The per-packet backoff $\Omega$ has a wide-sense heavy-tailed distribution. In addition, its $c$th moment $E[\Omega^c]$ is

• infinite if $c > \alpha$,

• and finite if $0 < c < \alpha$.
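As a small illustration of Theorem 2, the sketch below computes the exponent $\alpha = -(\log\gamma)/\log m$ and reports on which side of $\alpha$ a given moment order falls. The function names are hypothetical, and the boundary case $c = \alpha$ (which the theorem does not cover) is simply reported as not finite here.

```python
import math


def tail_exponent(gamma, m):
    """alpha = -(log gamma) / log m for collision probability gamma and factor m."""
    if not (0.0 < gamma < 1.0) or m <= 1.0:
        raise ValueError("need 0 < gamma < 1 and m > 1")
    return -math.log(gamma) / math.log(m)


def moment_is_finite(c, gamma, m):
    """True if 0 < c < alpha (finite moment), False if c >= alpha.

    Theorem 2 only states the cases c > alpha and 0 < c < alpha; treating
    c == alpha as 'not finite' is a convention of this sketch.
    """
    if c <= 0.0:
        raise ValueError("moment order c must be positive")
    return c < tail_exponent(gamma, m)


if __name__ == "__main__":
    # Binary exponential backoff (m = 2) with collision probability 0.3.
    alpha = tail_exponent(0.3, 2)
    print(alpha, moment_is_finite(1.0, 0.3, 2), moment_is_finite(2.0, 0.3, 2))
```

For $m = 2$ the variance ($c = 2$) ceases to exist as soon as $\gamma > 1/4$, which is the condition $\gamma > 1/m^2$ quoted in the abstract.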

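Proof: First we note that $\alpha = -(\log\gamma)/\log m$ is equivalent to $m^{\alpha}\gamma = 1$. It also follows from Lemma 1 that $\alpha > 1$. Letting $c$ be any real number such that $c > \alpha$, we have $m^{c}\gamma > 1$. Then the $c$th moment of $\Omega$, $E[\Omega^c]$, can be computed as

$$E[\Omega^c] = \sum_{k=0}^{K} E\Big[\Big(\sum_{k'=0}^{k} B_{k'}\Big)^{c}\Big] P[\kappa = k] \ \ge\ \sum_{k=0}^{K} \Big(\sum_{k'=0}^{k} \big(b_0 m^{k'} - 1/2\big)\Big)^{c} P[\kappa = k] \ \ge\ \sum_{k=0}^{K} \sum_{k'=0}^{k} \big(b_0 m^{k'} - 1/2\big)^{c} P[\kappa = k],$$

where $B_{k'}$ denotes the backoff drawn at stage $k'$ (with mean $b_0 m^{k'} - 1/2$) and $\kappa$ the final backoff stage of the packet. The first inequality holds by Hölder's inequality for expectations, i.e., $(E[X])^c \le E[X^c]$, and the second inequality follows from $c > 1$. Hence, from the last expression, we have $E[\Omega^c] \to \infty$ as $K \to \infty$. Note that $c$ is real. Since there exist infinite moments, $\Omega$ has a wide-sense heavy-tailed distribution.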
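Now consider the $c$th moment for $1 < c < \alpha$.

$$E[\Omega^c] = \sum_{k=0}^{K} E\Big[\Big(\sum_{k'=0}^{k} B_{k'}\Big)^{c}\Big] P[\kappa = k]$$
$$\le \sum_{k=0}^{K} (k+1)^{c-1} \sum_{k'=0}^{k} E\big[B_{k'}^{c}\big]\, P[\kappa = k] \tag{11}$$
$$= \sum_{k=0}^{K} (k+1)^{c-1} \sum_{k'=0}^{k} \frac{(2 b_0 m^{k'} - 1)^{c}}{c+1}\, P[\kappa = k] \tag{12}$$
$$\le \frac{(2 b_0)^{c}}{c+1} \sum_{k=0}^{K} (k+1)^{c-1}\, \frac{m^{c(k+1)} - 1}{m^{c} - 1}\, \gamma^{k} \tag{13}$$
$$\le \frac{(2 b_0 m)^{c}}{(c+1)(m^{c} - 1)} \sum_{k=0}^{\infty} (k+1)^{c-1} (m^{c}\gamma)^{k}, \tag{14}$$

where (11) can be obtained by applying the original Hölder's inequality, i.e.,

$$\sum_{k'=0}^{k} 1 \cdot b_{k'} \le \Big(\sum_{k'=0}^{k} 1\Big)^{(c-1)/c} \Big(\sum_{k'=0}^{k} (b_{k'})^{c}\Big)^{1/c},$$

(12) can be verified by computing $\int b^{c} f_{k'}(b)\,db$ where $f_{k'}(b)$ is a uniform pdf with mean $b_0 m^{k'} - 1/2$, and (13) follows from $P[\kappa = k] \le \gamma^{k}$. Then it suffices to show that d'Alembert's ratio of the series (14) is less than one. Recalling that $m^{c}\gamma < m^{\alpha}\gamma = 1$, we can see that

$$\lim_{k \to \infty} \frac{(k+2)^{c-1} (m^{c}\gamma)^{k+1}}{(k+1)^{c-1} (m^{c}\gamma)^{k}} = m^{c}\gamma < 1.$$

This establishes that (14) is finite for $K = \infty$, and completes the proof. ■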
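The ratio-test step can be checked numerically. The sketch below assumes that the series in (14) has terms of the form $(k+1)^{c-1}(m^{c}\gamma)^{k}$, which is how the surrounding argument describes it; the function names are hypothetical. It shows that the partial sums settle when $c < \alpha$ and that the terms keep growing when $c > \alpha$.

```python
def ratio_limit(c, m, gamma):
    """Limit of d'Alembert's ratio for the assumed series: m^c * gamma."""
    return (m ** c) * gamma


def bound_series_terms(c, m, gamma, num_terms):
    """Terms (k+1)^(c-1) * (m^c * gamma)^k for k = 0, ..., num_terms - 1."""
    r = ratio_limit(c, m, gamma)
    return [(k + 1) ** (c - 1) * r ** k for k in range(num_terms)]


def partial_sum(c, m, gamma, num_terms):
    """Partial sum of the first num_terms terms."""
    return sum(bound_series_terms(c, m, gamma, num_terms))


if __name__ == "__main__":
    m, gamma = 2, 0.25           # alpha = 2 for these values
    for c in (1.5, 2.5):         # one order below alpha, one above
        print(c, ratio_limit(c, m, gamma),
              partial_sum(c, m, gamma, 50), partial_sum(c, m, gamma, 100))
```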

Remark 3 [Answer to Q4] This theorem reveals that $\Omega$ is wide-sense heavy-tailed in the sense that not all of its moments exist, as Sakurai and Vu [34, Theorem 1] first noted.

As shown in Fig. 1, the variance $\sigma_\Omega^2$ in 802.11b is not very large. Nevertheless, the statistics of $\Omega$ certainly contain precursors of infinite-variance distributions, as shown in the next section.

B. Strict-Sense Heavy-Tailedness: Tauberian Insights 

Although there has been some work proving the wide-sense heavy-tailedness of the delay or backoff duration [34], and the power-law-like behavior of access delays has been identified only through simulations in a few works [34], [37], to the best of our knowledge, none of them proved that the delay or backoff duration has a power-law tail. This quite intuitive property has not been established mainly due to the theoretical difficulties underlying the proof. It is important to note that the following theorem is a prerequisite for the mathematical analysis of Noah effect, which implies strict-sense heavy-tailedness.

We would like to place particular emphasis on the following theorem for another reason. We note that some work [21], [39] considered the question whether a single long-lived TCP flow can generate traffic that exhibits long-range dependence (or, equivalently, asymptotic second-order self-similarity). It is significant that long-range dependence is a property which is automatically implied by heavy-tailed inter-arrival times [30] for the single flow (or node) case, irrespective of the context. That is, even a renewal process (no correlation of inter-arrival times) with heavy-tail distributed inter-arrival times generates long-range dependence in the counting process. In light of this point, one does not need to analyze enormous traffic traces if there is a solid mathematical work that can settle this kind of dispute.

In the following theorem, we prove that the per-packet backoff distribution has a power tail by exploiting the fact that the moment generating function has a recursive relation, and by applying the theory of regular variation [10] and the less-known modified Tauberian theorem of Bingham & Doney [9]. Note that this theorem requires only $K = \infty$, and nothing about $N$.

Theorem 3 (Power Tail Principle^)

The per-packet backoff $\Omega$ has a Pareto-type tail with an exponent of $-\alpha$. Formally,

$$F_\Omega^c(x) := P[\Omega > x] \sim x^{-\alpha}\,\ell(x). \tag{15}$$

The notation $f(x) \sim g(x)$ means $\lim_{x\to\infty} f(x)/g(x) = 1$, and $\ell(x)$ is slowly varying*.

Remark 4 This principle, formulated in terms of the ccdf $F_\Omega^c(\cdot)$, not only defines a fundamental characteristic of delay but also lays the groundwork for further analysis using regular variation theory.

R4.1 [Answer to Q4] This clear-cut and simple result reveals the statistical attribute of $\Omega$ for any population $N$. It has a Pareto-type distribution whose exponent parameter is $-\alpha$. Theorem 3 proves the strict-sense heavy-tailedness of $\Omega$ for $\alpha < 2$, and puts an end to the discussions in Section U

R4.2 This theorem dispenses with the complicated convolution expression (4) and leads us to a simpler conclusion. The most representative distribution of backoff times $\Omega$ is a truncated Pareto-type distribution (though it must be slowly-varying), rather than uniform or exponential as observed in the simulation studies of [5], ifHl .

R4.3 The simplistic term in (15) is irreplaceable with any other expressions, implying its pivotal role. For instance, the Final Value Theorem tells nothing but $f_\Omega(\infty) = 0$.

The ccdf of $\Omega$ obtained through ns-2 simulations is plotted in Fig. 2 on a log-log scale where the estimated slopes $\hat{\alpha}$ are compared with the analytical formulae $\alpha = -(\log\gamma)/\log m$, (FPE1) and (FPE2). Observe that these simple formulae along with (15) provide a precise estimate for the tail distribution. Remarkably, even for $K = 6$, i.e., the value adopted in 802.11b, the ccdf of $\Omega$ can be accurately approximated by a truncated power-law tail.
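A log-log tail check of this kind can be sketched in a few lines. The code below is not the ns-2 setup used for Fig. 2: it assumes a fixed collision probability, infinitely many stages ($K = \infty$), and a stage-$k$ backoff drawn uniformly from $\{0, \dots, 2 b_0 m^k - 1\}$ (so that its mean is $b_0 m^k - 1/2$). The Hill estimator is a standard tail-index estimator swapped in here for the slope fit, and the closed-form mean is derived under the same assumptions. All names are hypothetical.

```python
import math
import random


def sample_backoff(gamma, m, b0, rng):
    """One per-packet backoff: sum of stage backoffs until the first success."""
    total, stage = 0, 0
    while True:
        window = 2 * b0 * m ** stage        # integer m and b0 assumed
        total += rng.randrange(window)      # uniform on {0, ..., window - 1}
        if rng.random() >= gamma:           # success with probability 1 - gamma
            return total
        stage += 1


def mean_backoff(gamma, m, b0):
    """Sum over k of gamma^k * (b0*m^k - 1/2); requires m * gamma < 1."""
    return b0 / (1.0 - m * gamma) - 0.5 / (1.0 - gamma)


def hill_estimator(samples, k):
    """Hill estimate of the tail index from the k largest samples."""
    top = sorted(samples, reverse=True)[: k + 1]
    threshold = top[k]
    return k / sum(math.log(x / threshold) for x in top[:k])


if __name__ == "__main__":
    rng = random.Random(2024)
    gamma, m, b0 = 0.35, 2, 16
    draws = [sample_backoff(gamma, m, b0, rng) for _ in range(200000)]
    print("analytic alpha:", -math.log(gamma) / math.log(m))
    print("Hill estimate :", hill_estimator(draws, 2000))
```

Because the tail in (15) carries a slowly varying factor, the Hill estimate on simulated backoffs should be read as a rough indication of $\alpha$ rather than a precise fit.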

^The proof in fact requires $\alpha$ to be not an integer. For the complicated case when $\alpha$ is an integer, we refer to [19] and [10, Theorem 8.1.6]. However, since an integer $\alpha$ can be approximated for any small $\epsilon > 0$ by a real number $\alpha'$ such that $|\alpha - \alpha'| < \epsilon$, we expect the result of Theorem 3 to be valid for all $\alpha > 0$.

*A function $f(x)$ is called regularly varying [10] at infinity of index $\rho$ iff $\lim_{x\to\infty} f(\lambda x)/f(x) = \lambda^{\rho}$, $\forall\lambda > 0$. For the special case $\rho = 0$, it is called slowly varying and usually denoted by $\ell(x)$. For example, a positive constant and $(\log x)^{\epsilon}$ for any real number $\epsilon$ are slowly varying functions. A slowly varying function $\ell(x)$ is dominated by any positive power function, i.e., $\lim_{x\to\infty} \ell(x)/x^{\epsilon} = 0$, $\forall\epsilon > 0$.



[Figure 2: log-log plot of the ccdf against the per-packet backoff (x). Curves: simulation for K = 6, N = 10; K = 15, N = 10; K = 6, N = 40; K = 15, N = 40. Annotated slope values: 1.01, 1.07, 1.76, 1.81.]

Fig. 2. Complementary cumulative distribution function $F_\Omega^c(x)$ for K = 6, 15; and N = 10, 40.



VI. Short-Term Fairness Analysis 

First of all, we drop the assumption $K = \infty$ made in Section V because we present in this section a new approximation for the superposition process and a short-term fairness analysis, both of which are applicable to both cases $K < \infty$ and $K = \infty$.



A. Inter-Transmission Probability 

The notion of short-term fairness [5], [12], [26], defined as the distribution of successful transmissions of nodes over a finite time, has been in the limelight due to its central role in quantifying the behavior of random access protocols over short time-scales and its close link to access delays. Among the set of nodes $\{1, \dots, N\}$, we tag node $N$, without loss of generality. Assume that the tagged node successfully transmitted a packet at time $t = 0$. Denote by $Z_n$ the number of packets successfully transmitted by node $n$ while the tagged node transmits $\zeta$ packets. Recalling that $A^n(t)$ counts the arrivals during the interval $(0, t]$, we can see

$$Z_n := A^n(t') \quad \text{where} \quad t' = \min\{t : A^N(t) = \zeta\}.$$

It is clear that $Z_N = \zeta$, from the above definition. For short-term fairness analysis, we consider

$$Z := \sum_{n=1}^{N-1} Z_n.$$

For the sake of convenience, we denote $P[Z = z \mid Z_N = \zeta]$ by $P_N[z|\zeta]$. We call the conditional probability $P_N[z|\zeta]$ inter-transmission probability. In terms of the point processes it is equivalent to

$$P_N[z|\zeta] = P\Big[\sum_{n=1}^{N-1} A^n\Big(\sum_{j=1}^{\zeta} \Omega_j\Big) = z\Big],$$

where $\Omega_j$ denotes the per-packet backoff for the $j$th packet of the tagged node $N$, and the $\Omega_j$ are i.i.d. copies of $\Omega$.






B. Intermediate Telecom Process on Coarse Time Scales 

The premise does not hold: Look into the above superposition process $\sum_{n=1}^{N-1} A^n(t)$ where $t = \sum_{j=1}^{\zeta} \Omega_j$. Recall that the basic premise of the Poisson limit theorem (Lemma 2) is that each component process must become sparse as $N$ grows. It is easy to see that this premise does not hold any longer here because $t = \sum_{j=1}^{\zeta} \Omega_j$ is of the order of $\zeta \cdot \bar{\Omega}$ in the sense that $E[t] = \Theta(\zeta\bar{\Omega})$, where $\bar{\Omega}$ is of the order of $N$ in most cases (see Remark 2). Therefore, we need a new approximation of the superposition process on coarse time-scales such that $t = \Theta(N)$.

Before that, we briefly summarize the theory of stable laws [41, Chapter 4] only for the case $\alpha \in (1, 2]$. Denote by $S_\alpha(\sigma, \beta, \mu)$ Lévy $\alpha$-stable laws whose four parameters are: the index $\alpha$; the scale parameter $\sigma$; the skewness parameter $\beta$; and the mean $\mu$. If $X_1, \dots, X_n$ are i.i.d. copies of $S_\alpha(\sigma, \beta, \mu)$, they satisfy the stability property which takes the following form

$$\sum_{i=1}^{n} (X_i - \mu) \overset{d}{=} n^{1/\alpha}(X_1 - \mu),$$

where the notation $\overset{d}{=}$ means equality in distribution. The case $\alpha = 2$ is singular because we have $S_2(\sigma, \beta, \mu) = N(\mu, 2\sigma^2)$ where $\beta$ plays no role. However, for the rest of cases $\alpha \in (1, 2)$, there is no closed form expression for its pdf.
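As a quick sanity check of the stability property, consider the Gaussian case $\alpha = 2$, where everything is explicit. This worked step is added for illustration only and uses nothing beyond the variance of a sum of independent variables.

```latex
% Case alpha = 2: X_i ~ N(mu, 2 sigma^2), independent.
\begin{align*}
\operatorname{Var}\Big[\sum_{i=1}^{n} (X_i - \mu)\Big] &= n \cdot 2\sigma^2, \\
\operatorname{Var}\big[n^{1/2}(X_1 - \mu)\big] &= n \cdot 2\sigma^2 .
\end{align*}
% Both sides are zero-mean Gaussian with the same variance,
% hence equal in distribution, as the stability property requires.
```

For $\alpha < 2$ the same scaling $n^{1/\alpha}$ grows faster than $n^{1/2}$, which reflects the heavier tails of the summands.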

Since Leland et al. [29] created a wave of interest in the self-similarity in the Internet, the probabilistic community has been concerned with the limit processes of aggregate renewal processes under different limit regimes. Here a point at issue was the order of limit operations, i.e., $t \to \infty$ and $N \to \infty$. Recently, Kaj et al. [24] have established a fundamental connection between Noah effect and Joseph effect, elucidating the above issue as well.

Aggregate Process on Coarse Time Scales: A premise of [24, Theorem 1] is that each component process should not become sparse as $N$ grows, i.e., inter-arrival times not scaling with $N$. This premise is fully satisfied when we consider $A^n(\bar{\Omega}t)$ instead of $A^n(t)$. In other words, we now view $A^n(t)$ on coarse time-scales $\tau = \bar{\Omega}t$. Also note that $E[A^n(\bar{\Omega}t)] = t$. Then applying [24, Theorem 1] yields the following result which is applicable to various cases: $K = \infty$, $K < \infty$, finite time (which must be large enough though), and infinite time.

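Theorem 4 (Dichotomy of Aggregation: Second Part)

Suppose, for $K = \infty$, the inter-arrival times of $A^n(\bar{\Omega}t)$ have the ccdf $F_\Omega^c(\bar{\Omega}x)$ in (15), which does not vary with $N$. For $K < \infty$, nothing is assumed. Define the centred superposition process

$$\tilde{A}(t) := \sum_{n=1}^{N} A^n(\bar{\Omega}t) - Nt.$$

Then, as $\zeta \to \infty$ and $N \to \infty$, we have

$$\tilde{A}(\zeta t)/\zeta \Rightarrow -c \cdot Y_\alpha(t/c), \quad \text{for } K = \infty,\ \alpha \in (1, 2), \tag{16}$$
$$\tilde{A}(\zeta t)/\sqrt{N\zeta} \Rightarrow v_\Omega \cdot B(t), \quad \text{for } K < \infty, \text{ or for } K = \infty,\ \alpha \in (2, \infty), \tag{17}$$

where the scaling constant $c := \{N\bar{\Omega}^{-\alpha}\ell(\zeta\bar{\Omega})\}^{1/(\alpha-1)}/\zeta$, $B(\cdot)$ is a standard Brownian motion, and $Y_\alpha(\cdot)$ belongs to the family of Intermediate Telecom processes [25] of index $\alpha$ whose cgf takes the form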


log E [( 

+ /; (o«^ - 1 - ex) {c 



1 - er) 



+ (2 - a)a;~") da;. (18) 



Proof: First, for $K = \infty$, the ccdf of the inter-arrival times of $A^n(\bar{\Omega}t)$ now satisfies $F_\Omega^c(\bar{\Omega}x) \sim x^{-\alpha}\,\bar{\Omega}^{-\alpha}\ell(\bar{\Omega}x)$ due to its scaling. From $E[A^n(\bar{\Omega}t)] = t$, the mean inter-arrival time is one. It follows from the underlined assumption that $\bar{\Omega}^{-\alpha}\ell(\bar{\Omega}x)$ does not scale with $N$ and it is a slowly-varying function of $x$. Applying [24, Theorem 1] yields that the centred superposition process, normalized by $\zeta$, weakly converges to the process in (16).

For the rest of cases, (i) $K < \infty$ and (ii) $K = \infty$ and $\alpha \in (2, \infty)$, we do not need any assumption because $E[\Omega^2] < \infty$ holds both for (i) and (ii) by appealing to Theorem 2. These finite variance cases were analyzed in [36, Section 2.1.3(a)] whose 'ON/OFF source model' reduces to our model if we use $\mu_1 = 1$ and ^ 1. Note that it is discussed in [36, Section 2.3] that the order of limit operations does not matter in these cases. ■

The phrase 'as $\zeta \to \infty$ and $N \to \infty$' is pregnant with meaning. The fundamental strength of the above theorem for the case of (16) is that its result is not subject to the order of limit operations. Instead, the scaling structure between $\zeta$ and $N$, represented by $c$, determines the kind of the approximation in the sense that, as $c \to 0$ and $c \to \infty$, $c^{1/\alpha}Y_\alpha(t/c)$ and $c^{H}Y_\alpha(t/c)$ respectively converge to $\Lambda_\alpha(t)$ ($\alpha$-stable Lévy motion) and $B_H(t)$ (fractional Brownian motion of index $H = (3 - \alpha)/2$), up to constants [22]. For finite $c \in (0, \infty)$, $Y_\alpha(t)$ becomes an in-between process. For the case of (17), even this scaling structure does not matter.

It is significant that $c \to 0$ and $c \to \infty$ are respectively equivalent to $\lim_{N\to\infty}\lim_{\zeta\to\infty}$ and $\lim_{\zeta\to\infty}\lim_{N\to\infty}$ in the literature. Therefore, the essence of the advance [24, Theorem 1] is that it has emancipated the limit form of the superposition process from the order of the two limit operations, widening the applicability of the theory.
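To see how the scaling constant moves between the two regimes, the sketch below evaluates $c = \{N\bar{\Omega}^{-\alpha}\ell\}^{1/(\alpha-1)}/\zeta$ under the simplifying assumption that the slowly varying factor $\ell(\cdot)$ is a constant. The function name and the numerical values are hypothetical and chosen only for illustration.

```python
def scaling_constant(num_nodes, mean_backoff, alpha, zeta, ell=1.0):
    """c = (N * mean_backoff^(-alpha) * ell)^(1/(alpha-1)) / zeta.

    ell stands in for the slowly varying factor, treated here as a constant
    (an assumption of this sketch). Requires 1 < alpha < 2.
    """
    if not (1.0 < alpha < 2.0):
        raise ValueError("the heavy-tailed regime needs 1 < alpha < 2")
    base = num_nodes * mean_backoff ** (-alpha) * ell
    return base ** (1.0 / (alpha - 1.0)) / zeta


if __name__ == "__main__":
    # Growing zeta with N fixed drives c toward 0 (stable-Levy side);
    # growing N with zeta fixed drives c upward (fractional-Brownian side).
    for zeta in (10.0, 100.0, 1000.0):
        print("zeta", zeta, "c", scaling_constant(40, 100.0, 1.5, zeta))
    for n in (40, 400, 4000):
        print("N", n, "c", scaling_constant(n, 100.0, 1.5, 100.0))
```

In this simplified form $c$ is inversely proportional to $\zeta$ and increases with $N$ as $N^{1/(\alpha-1)}$; in the model itself $\bar{\Omega}$ and $\alpha$ also change with $N$, which this sketch deliberately ignores.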

Remark 5 Though, for $K = \infty$, the underlined phrase makes a strong assumption which is not reasonable in view of $\alpha = -(\log\gamma)/\log m$, which heavily depends on $N$, the above theorem is still valuable in the sense that it suggests a possible approximation of the backoff process in 802.11, based on the state-of-the-art theory. We will come back to the preciseness of the approximation later in Remark 7 where we observe that $\alpha$ is required to be not too close to 1.

R5.1 As we have discussed in Footnote 5 and [41, Chapter 9] as well as at the beginning of this section, the Poisson approximation in Theorem 1 is poor on coarse time-scales, i.e., large time. Therefore, for short-term fairness analysis, the following approximations inspired by (16) and (17) are essential:

$$A(\zeta\bar{\Omega}t) \approx N\zeta t - \zeta \cdot c \cdot Y_\alpha(t/c), \quad \text{for } K = \infty,\ \alpha \in (1, 2), \tag{19}$$
$$A(\zeta\bar{\Omega}t) \approx N\zeta t + \sqrt{N\zeta} \cdot v_\Omega \cdot B(t), \quad \text{otherwise}. \tag{20}$$

^Consistency between (17) and Theorem 1: Suppose $K = \infty$ and $\alpha \in (2, \infty)$ (which is very unlikely as $N$ must be large). Then assume the superposition process $A(\zeta\bar{\Omega}t)$ is Poisson. For large $\zeta\bar{\Omega}$, this Poisson process should have a Gaussian marginal distribution with mean $N\zeta t$ and variance $N\zeta t$, whereas the process in (17) has mean $N\zeta t$ and variance $v_\Omega^2 N\zeta t$. Therefore, Theorem 1 is inconsistent with (17) for $v_\Omega \ne 1$. The inconsistency is due to the premise of Theorem 1, i.e., finite time. A similar remark is given in [41, Remark 9.8.1].

R5.2 [Answer to Q5] It turns out that for $K = \infty$ and $\alpha \in (1, 2)$, the superposition process $A(\zeta\bar{\Omega}t) = \sum_{n=1}^{N} A^n(\zeta\bar{\Omega}t)$ exhibits long-range dependence due to the heavy power tail of inter-arrival times $\Omega$. This process is non-Gaussian and non-stable and has stationary, but strongly dependent, increments in the sense that it has the same covariance as a multiple of fractional Brownian motion of index $H = (3 - \alpha)/2$ [22]. It is also shown in [22] that this process is (both locally and globally) asymptotically self-similar though not self-similar. We believe that the networking community has been longing for mathematical evidence which makes extensive simulations in [37] less necessary.

Turning back to the discussion of the inter-transmission probability $P_N[z|\zeta]$ in Section VI-A, we demonstrate the strength of the above approximations in the following corollaries where $\zeta$ is now taken to be the number of packets transmitted by the tagged node.

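Corollary 1 (Asymp. Inter-Transmission Probability)

Suppose $\zeta \gg 1$ and $N \gg 1$. If $K = \infty$ along with $\alpha \in (1, 2)$, we have

$$P_N[Z = z|\zeta] \approx \int_{-\infty}^{\infty} \int_{q_-(T(y))}^{q_+(T(y))} \Upsilon_\alpha^{T(y)/c}(x)\,dx \cdot L_\alpha(y)\,dy \tag{21}$$

where $q_\pm(T(y)) := -\{z \mp \delta - (N-1)\zeta \cdot T(y)\}/(\zeta c)$, $\delta = 1/2$, $T(y) := 1 + \zeta^{(1-\alpha)/\alpha}\,\ell_0(\zeta) \cdot y$. Here $\ell_0(\cdot)$ is slowly varying at infinity, $\Upsilon_\alpha^{t}(\cdot)$ is the pdf of $Y_\alpha(t)$ whose cgf is given by (18), and $L_\alpha(\cdot)$ is the pdf of $S_\alpha(1, 1, 0)$ whose index is $\alpha = -(\log\gamma)/\log m$.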

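Proof: Under the assumption $\zeta \gg 1$ and $N \gg 1$, it follows from Theorem 4 that $\sum_{n=1}^{N-1} A^n(\zeta\bar{\Omega}t)$ can be approximated by an Intermediate Telecom process so that its marginal distribution takes the form

$$P\big[(N-1)\zeta t - \zeta c\,Y_\alpha(t/c) \in (z - \delta, z + \delta)\big] = P\big[Y_\alpha(t/c) \in (q_-(t), q_+(t))\big] = \int_{q_-(t)}^{q_+(t)} \Upsilon_\alpha^{t/c}(x)\,dx. \tag{22}$$

In the meantime, it follows from the definition of skewness $\beta$ and $f_\Omega(-x) = 0$, $\forall x > 0$ that

$$\beta = \lim_{x\to\infty} \frac{P[\Omega > x] - P[\Omega < -x]}{P[\Omega > x] + P[\Omega < -x]} = 1.$$

Put $t = \sum_{j=1}^{\zeta} \Omega_j/(\zeta\bar{\Omega})$. Applying the lesser-known stable-law central limit theorem [41, Theorem 4.5.1] to the power tailedness result of Theorem 3 taken together with the fact $\beta = 1$, it follows that, for $\zeta \gg 1$,

$$t \approx 1 + \zeta^{(1-\alpha)/\alpha} \cdot \ell_0(\zeta) \cdot S_\alpha(1, 1, 0).$$

Plugging this line into (22) yields (21). ■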



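Corollary 2 (Inter-Transmission Probability)

Suppose $\zeta \gg 1$ and $N \gg 1$. If $K < \infty$, or $K = \infty$ along with $\alpha \in (2, \infty)$, we have

$$P_N[z|\zeta] \approx \frac{1}{v_\Omega (N-1)\sqrt{\zeta}}\, N_m\!\left(\frac{z - (N-1)\zeta}{v_\Omega (N-1)\sqrt{\zeta}}\right) \tag{23}$$

where the CV $v_\Omega$ is given by (9), and $N_m(x) := \frac{1}{\sqrt{2\pi}} e^{-x^2/2}$.

Proof: Likewise, we have

$$P_N[z|\zeta] \approx P\big[(N-1)\zeta t + N(0, v_\Omega^2 (N-1)\zeta t) \in (z - \delta, z + \delta)\big] \approx \frac{1}{v_\Omega\sqrt{(N-1)\zeta t}}\, N_m\!\left(\frac{z - (N-1)\zeta t}{v_\Omega\sqrt{(N-1)\zeta t}}\right) \tag{24}$$

where $N(\mu, \sigma^2)$ is the Gaussian random variable with mean $\mu$ and variance $\sigma^2$. Putting $t = \sum_{j=1}^{\zeta} \Omega_j/(\zeta\bar{\Omega})$, $t$ is approximated by

$$t \approx \frac{1}{\zeta} \cdot N(\zeta, v_\Omega^2\zeta) = 1 + \frac{v_\Omega}{\sqrt{\zeta}} \cdot N(0, 1)$$

for $\zeta \gg 1$. Thus (24) becomes

$$\int \frac{1}{v_\Omega\sqrt{(N-1)(\zeta + \sqrt{\zeta}\,v_\Omega x)}}\, N_m\!\left(\frac{z - (N-1)(\zeta + \sqrt{\zeta}\,v_\Omega x)}{v_\Omega\sqrt{(N-1)(\zeta + \sqrt{\zeta}\,v_\Omega x)}}\right) N_m(x)\,dx$$

which is approximated as (23) because the denominator $v_\Omega (N-1)^{1/2}(\zeta + \sqrt{\zeta}\,v_\Omega x)^{1/2}$ is very large so that the first pdf of the integrand is concentrated around $z = (N-1)(\zeta + \sqrt{\zeta}\,v_\Omega x)$. ■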

Remark 6 The derived equations provide us several penetrating insights and answers to Q2 as well. Note that the mean and variance of (21) are given by

$$\bar{Z} := \sum_{z=0}^{\infty} z \cdot P_N[z|\zeta] \approx (N-1)\zeta, \qquad \sigma_Z^2 := \Big(\sum_{z=0}^{\infty} z^2 \cdot P_N[z|\zeta]\Big) - \bar{Z}^2 \approx \infty, \tag{25}$$

while those of (23) are given by

$$\bar{Z} \approx (N-1)\zeta, \qquad \sigma_Z^2 \approx (N-1)^2 \zeta\, v_\Omega^2. \tag{26}$$

R6.1 For the case of (23), we can say that $Z$ is approximately Gaussian for large $\zeta$ and $N$:

$$Z \sim N\big((N-1)\zeta,\ (N-1)^2 \zeta\, v_\Omega^2\big) \tag{27}$$

whereupon the CV of $Z$ can be computed from (26) as

$$v_Z := \sigma_Z/\bar{Z} \approx v_\Omega/\sqrt{\zeta}. \tag{28}$$

Remarkably, we have derived the most general expression of the inter-transmission probability $P_N[z|\zeta]$ while H, [12] derived the expressions of $P_N[z|\zeta]$ only for $N = 2$.
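The Gaussian case is simple enough to evaluate directly. The sketch below assumes the Gaussian form stated in R6.1, i.e., mean $(N-1)\zeta$ and standard deviation $(N-1)\sqrt{\zeta}\,v_\Omega$; the function names and the value $v_\Omega = 1.2$ are hypothetical placeholders, not figures from this paper.

```python
import math


def gaussian_inter_tx_prob(z, num_nodes, zeta, cv_omega):
    """Gaussian approximation of the inter-transmission probability at z."""
    mean = (num_nodes - 1) * zeta
    std = (num_nodes - 1) * math.sqrt(zeta) * cv_omega
    u = (z - mean) / std
    return math.exp(-0.5 * u * u) / (std * math.sqrt(2.0 * math.pi))


def cv_of_z(zeta, cv_omega):
    """Coefficient of variation of Z: cv_omega / sqrt(zeta)."""
    return cv_omega / math.sqrt(zeta)


if __name__ == "__main__":
    n, zeta, v = 40, 100, 1.2     # v is an illustrative CV, not a measured one
    peak = (n - 1) * zeta
    print("P at the mean:", gaussian_inter_tx_prob(peak, n, zeta, v))
    print("CV of Z      :", cv_of_z(zeta, v))
```

The CV of $Z$ shrinks as $1/\sqrt{\zeta}$ and does not depend on $N$, so in this regime fairness improves with the observation window at the usual central-limit rate.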

Fig. 3. Inter-transmission probability $P_N[z|\zeta]$ for $\zeta = 100$; K = 6, 15; and N = 40, 60.

R6.2 (21) cannot be simplified in general. However, for very large $\zeta$, hence very small $c$, it can be easily seen that $Z$ has a Lévy $\alpha$-stable distribution. Applying [22, Proposition 2] to the right-hand side of ( fTSI l yields that it is negligible, implying that the inner integral of (21) can be removed. Then $Z$ becomes approximately Lévian and is expressed in the form

$$Z \sim S_\alpha\big((N-1)\zeta^{1/\alpha}\ell_0(\zeta),\ 1,\ (N-1)\zeta\big).$$

This manifests the heavy tail of $Z$, i.e.,

$$P[Z > x] \sim C_\alpha (N-1)^{\alpha}\, \zeta\, \ell_0(\zeta)^{\alpha}\, x^{-\alpha}, \tag{29}$$

where $\ell_0(\cdot)$ is the same function used in $T(y)$ in (21) and

$$C_\alpha := (\alpha - 1)/\big(\Gamma(2 - \alpha)\sin(\pi(\alpha - 1)/2)\big).$$
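The tail constant is easy to evaluate numerically. The sketch below computes $C_\alpha = (\alpha-1)/(\Gamma(2-\alpha)\sin(\pi(\alpha-1)/2))$ and, as an assumption of this sketch, evaluates a power tail of the form $C_\alpha\,\sigma^{\alpha} x^{-\alpha}$ with scale $\sigma = (N-1)\zeta^{1/\alpha}\ell_0$, treating $\ell_0$ as a constant. The function names are hypothetical.

```python
import math


def stable_tail_constant(alpha):
    """C_alpha = (alpha - 1) / (Gamma(2 - alpha) * sin(pi * (alpha - 1) / 2))."""
    if not (1.0 < alpha < 2.0):
        raise ValueError("defined here only for 1 < alpha < 2")
    return (alpha - 1.0) / (
        math.gamma(2.0 - alpha) * math.sin(math.pi * (alpha - 1.0) / 2.0)
    )


def z_tail_approx(x, alpha, num_nodes, zeta, ell0):
    """Assumed asymptotic tail C_alpha * sigma^alpha * x^(-alpha) for large x."""
    sigma = (num_nodes - 1) * zeta ** (1.0 / alpha) * ell0
    return stable_tail_constant(alpha) * sigma ** alpha * x ** (-alpha)


if __name__ == "__main__":
    for a in (1.2, 1.5, 1.8):
        print(a, stable_tail_constant(a))
```

Since the expression is only asymptotic, values returned for small $x$ can exceed 1 and carry no probabilistic meaning; only the decay rate $x^{-\alpha}$ and the prefactor are of interest.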



R6.3 For the case of $K = \infty$ and $\alpha \in (1, 2)$, the representation (29) reveals the striking similarity between the ccdfs of $\Omega$ and $Z$. In terms of regular variation theory, both are regularly varying of index $-\alpha$, and in Mandelbrot's terminology, Noah effect of $\Omega$ infiltrates into $Z$.

R6.4 For the case of $K = \infty$, the inter-transmission probability bifurcates into two different categories at $\alpha = 2$ (or $\gamma = 1/m^2$). Plainly speaking, if $\gamma < 1/m^2$, $Z$ can still be approximated by the Gaussian distribution in (27); otherwise 802.11 suffers from extreme unfairness containing precursors of power-tailed characteristics such as infinite variances and the skewness ($\beta = 1$).

R6.5 The skewness induces leaning tendency and directional unfairness. The leaning tendency implies the distribution is heavily leaning to the left, and the tendency increases as $\alpha$ decreases. The directional unfairness* implies that while the right part of the inter-transmission probability $z \in (\bar{Z}, \infty)$ has a heavy power tail given by (29), its left part $z \in (-\infty, \bar{Z})$ decays faster than exponentially [41, pp.113].

We conjecture based on extensive simulations that $\ell(\cdot)$ in (15) is approximately a constant, implying that $\ell_0(\cdot)$ in $T(y)$ in (21) is also a constant. Then it follows that the constant $\ell$ corresponds to the y-intercept of the straight line obtained by taking logarithms of (15), and can be estimated from Fig. 2. After manipulation akin to [41, Theorem 4.5.2], we can show a simple relation between them:

which implies that we need to estimate only $\ell$ and $\alpha$.

*The Lévy $\alpha$-stable law used in this work has support on the entire real line because $\alpha \in (1, 2)$.

In Fig. 3, the inter-transmission probability obtained through ns-2 simulations is compared with the derived formulae of Corollaries 1 and 2 for $\zeta = 100$. It is significant that, for $K = 6$, $P_N[Z = z|\zeta]$ is well approximated by the Gaussian formula (23) along with (FPE1) and (FPE2) for large $N$. This forms a striking contrast with the case $K = 15$ where the distribution (21) is leaning to the left and its peak is far apart from its mean, i.e., $\bar{Z} = (N-1)\zeta$, meaning that there are even heavier tails on the right part. Our extensive simulations attested to the inevitability of the complicated form (21).

Remark 7 Preciseness of the approximation (19) remains a question due to the underlined assumption of Theorem 4. Note that $N$ is determined by $\alpha = -(\log\gamma)/\log m$ provided that $\alpha$ is fixed, whereas [24, Theorem 1] demands that $N \to \infty$ provided that $\alpha$ is fixed. Through extensive simulations, we have found that the approximation (19) becomes poor as $\alpha \to 1$ (or as $N \to \infty$). Under the above simulation setting, if $N > 80$, the approximation does not appear reasonable. A thorough theory addressing this dependence between $N$ and $\alpha$ is left for future work.

VII. Wavelet Analysis of Long-Range Dependence 

We provide simulation results to support the argument over the long-range dependence in Section VI-B under the assumption $K = \infty$. Recall from Theorem 4 that the time-scaled version of the superposition arrival process is approximately

$$A(\zeta\bar{\Omega}t) = \sum_{n=1}^{N} A^n(\zeta\bar{\Omega}t) \approx N\zeta t - \zeta c \cdot Y_\alpha(t/c),$$

which holds for $N$ such that $\alpha = -(\log\gamma)/\log m < 2$. Note that such $N$ is to ensure $\Omega$ is strict-sense heavy-tailed (see Theorem 3). Then by appealing to [22], one can show that $A(\zeta\bar{\Omega}t)$ has long-range dependent increments in the sense that

• $A(\zeta\bar{\Omega}t)$ has the same covariance as a multiple of fractional Brownian motion of index $H = (3 - \alpha)/2$.

It is easy to see that $1/2 < H < 1$ due to $1 < \alpha < 2$.

All simulations obtained from ns-2 simulator use a 437/i 
warm-up period, after which we collected 728/i-long traces. 
To analyze these traces, we use the latest addition to the toolkit of inference techniques for long-range dependence, i.e., the refined wavelet-based method using Daubechies wavelets with $M$ vanishing moments which was proposed by Abry et al. [2]. They proposed the first unbiased estimator $y_j$ taking the form

$$E[y_j] = \log_2\big(E[d_j^2]\big),$$

considering the complication presented by the property $E[\log(\cdot)] \ne \log(E[\cdot])$, where $d_j$ denotes the detail processes of the wavelet transform.
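The idea behind the logscale diagram can be sketched with the simplest wavelet. The code below is a simplified stand-in for the method of Abry et al., not a reimplementation: it uses the Haar wavelet (one vanishing moment) instead of Daubechies wavelets with $M = 2$, omits the bias correction, and fits the slope by ordinary rather than weighted least squares. The Hurst parameter is then read off as $(1 + s)/2$ from the fitted slope $s$. All names are hypothetical.

```python
import math
import random


def haar_logscale(series, max_octave):
    """Return [(j, y_j)], y_j = log2(mean squared Haar detail at octave j)."""
    approx = list(series)
    spectrum = []
    for j in range(1, max_octave + 1):
        pairs = len(approx) // 2
        details = [(approx[2 * i] - approx[2 * i + 1]) / math.sqrt(2.0)
                   for i in range(pairs)]
        approx = [(approx[2 * i] + approx[2 * i + 1]) / math.sqrt(2.0)
                  for i in range(pairs)]
        energy = sum(d * d for d in details) / pairs
        spectrum.append((j, math.log2(energy)))
    return spectrum


def fit_slope(spectrum, j1, j2):
    """Ordinary least-squares slope of y_j against j over octaves j1..j2."""
    pts = [(j, y) for j, y in spectrum if j1 <= j <= j2]
    mean_j = sum(j for j, _ in pts) / len(pts)
    mean_y = sum(y for _, y in pts) / len(pts)
    num = sum((j - mean_j) * (y - mean_y) for j, y in pts)
    den = sum((j - mean_j) ** 2 for j, _ in pts)
    return num / den


if __name__ == "__main__":
    rng = random.Random(0)
    noise = [rng.gauss(0.0, 1.0) for _ in range(2 ** 14)]   # no memory
    s = fit_slope(haar_logscale(noise, 8), 1, 8)
    print("slope:", s, "H estimate:", (1.0 + s) / 2.0)
```

For a memoryless input such as white noise the spectrum is flat and the estimate stays near $H = 0.5$; a long-range dependent input would instead show a positive slope over the coarse octaves.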

The estimates $y_j$ of the wavelet spectra over all time-scales $j$, called octaves, are shown in Fig. 4 for $K = 6$, $K = 15$ and $K = 25$. Here we fix the other parameters as $N = 40$ and $M = 2$. Though we present here only the simulation results using Daubechies wavelets with $M = 2$, we obtained similar results using Daubechies wavelets with $M > 2$ and Discrete Meyer wavelets. To quantify the integrity of the method, Gaussian 95% confidence intervals corresponding to the variability of $y_j$ are also shown as the vertical segments centered on the estimates $y_j$. Then the measurement of index $H$, called Hurst parameter, is reduced to the identification of the region of alignment, the determination of its lower and upper cutoff octaves, $j_1$ and $j_2$, respectively, and the determination of the slope over the alignment region which we denote by $s$. From the slope estimate $s$, we can obtain the estimates of $H$ from the formula

$$H = (1 + s)/2.$$

[Figure 4: three logscale diagrams; x-axis: octave $j$ (time-scale in backoff time-slots). Panels: (a) K = 6, N = 40, titled with $(j_1, j_2) = (5, 18)$, slope Est. = 0.00, H Est. = 0.50; (b) K = 15, N = 40; (c) K = 25, N = 40, titled with $(j_1, j_2) = (12, 18)$, slope Est. = 0.66, H Est. = 0.83.]

Fig. 4. Wavelet spectra using Daubechies wavelets with $M = 2$.



Fig. 4(c) demonstrates that, for the case $K = 25$, the superposition arrival process possesses a sustained correlation structure over a broad range of time-scales $j \in [1, 18]$ where $s_j$ converges to 0.66 at octave $j = 18$, whereas, for the case $K = 6$, it shows a weaker correlation structure over a narrow range $j \in [1, 5]$ as shown in Fig. 4(a). The estimate of $H$ for $K = 25$ over the alignment region $(j_1, j_2) = (12, 18)$ approaches $H = 0.83$ around (16, 17), which approximately matches the analytical formula $H = (3 - \alpha)/2 \approx 0.90$ where $\alpha$ is obtained from $\alpha = -(\log\gamma)/\log m$, Eqs. (FPE1) and (FPE2). The estimate over the alignment region for $K = 6$ is computed as $H = 0.50$, implying that long-range dependence is not observed. A striking observation that can be made by comparing Figs. 4(a) and 4(b) with Fig. 4(c) is that the per-octave slope $s_j$ increases as octave $j$ increases and converges only if $K$ is large enough as in Fig. 4(c).
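The two routes to the Hurst parameter quoted above can be cross-checked with two one-line helpers. This is only an arithmetic sketch with hypothetical function names; the value $\alpha = 1.2$ is the one implied by the quoted analytical $H \approx 0.90$ through $H = (3-\alpha)/2$, not a separately reported figure.

```python
def hurst_from_slope(s):
    """Wavelet route: H = (1 + s) / 2 from the logscale-diagram slope s."""
    return (1.0 + s) / 2.0


def hurst_from_alpha(alpha):
    """Analytical route: H = (3 - alpha) / 2 for a tail exponent 1 < alpha < 2."""
    if not (1.0 < alpha < 2.0):
        raise ValueError("long-range dependence requires 1 < alpha < 2")
    return (3.0 - alpha) / 2.0


if __name__ == "__main__":
    print(hurst_from_slope(0.66))   # slope reported for K = 25
    print(hurst_from_slope(0.00))   # slope reported for K = 6
    print(hurst_from_alpha(1.2))    # alpha implied by H of about 0.90
```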

Observation 1 (LRD over coarse time-scales)

Long-range dependence of the superposition process is conspicuous only over coarse time-scales.

Remark 8 Essentially, there are two reasons behind this phenomenon which also give us answers to Q5.

R8.1 Per-node process slows down: It is important to recall that, for $K = \infty$, we first established the Poisson process approximation for the superposition process in Theorem 1, meaning that we cannot observe long-range dependence on normal time-scales. Just as the constant intensity of the superposition process is essential for Theorem 1, the constant intensity of the component process is essential for Theorem 4. To satisfy the latter, we had to consider $A^n(\zeta\bar{\Omega}t)$ instead of $A^n(t)$ because $A^n(t)$ becomes sparser as $N \to \infty$. That being said, we must view the superposition process over coarse time-scales $\zeta\bar{\Omega}t$ instead of $t$ to satisfy the premise of Theorem 4, which explains long-range dependence.

R8.2 Additional scaling of time: Another assumption of the limit regime considered in [24] is $\zeta \to \infty$ at the same time as $N \to \infty$. This implies we need additional scaling of time to compensate for the scaling of space.

R8.3 In practical terms, if the wireless link capacity is shared by many nodes, the aggregate transmission process is highly unlikely to exhibit long-range dependence for most practical $K$ values, essentially due to the reduced per-node rate and the additional time scaling.

We also conjecture that the above coarser time scalings caused the empirical analyses of Veres and Boda [39] (in the context of TCP) and Tickoo and Sikdar [37] (in the context of 802.11) not to support long-range dependence of the superposition arrival process of TCP sources: they observed that $H \approx 0.5$ (or $s = 0$), implying short-range dependence. This is because both 802.11 nodes accessing a common base station and TCP flows traversing a common bottleneck link (i) have similar backoff mechanisms and (ii) reduce (or slow down) their transmission rates to share the given capacity as the population increases.

VIII. Concluding Remarks 

Beginning with derivation of per-packet backoff distribution, 
based on which we studied its coefficient of variation that plays 
a key role in formulating short-term fairness in later sections, 
we have conducted a rigorous analysis of the backoff process 
in 802.11 and provided answers to several open questions. 

The power-tail principle states that the per-packet backoff 
has a truncated Pareto-type tail distribution, a simple 
description that elucidates existing works. This in turn indicates that 
its heavy-tailedness in the strict sense is inherited from collision, 
and paves the way for the rest of the analysis. The dichotomy 
of aggregation, proven with the aid of a recent advance [24, 
Theorem 1] in the probability community, now tells the whole 
story of the contrary limits of the superposition process, i.e., the 
Poisson process and the Intermediate Telecom process, emphasizing 
the importance of the time-scales on which we view the backoff 
processes. Thanks to the applicability of [24] widened by the 
order-free scaling operations of time and population ($N$), 
we identified long-range dependence in 802.11 and discovered 
that the inter-transmission probability bifurcates into two 
categories: either approximately Gaussian or a complicated 
distribution which, under a limiting condition, simplifies to a 
Lévy $\alpha$-stable distribution with $\alpha \in (1,2)$ possessing strong 
power-tail characteristics. 

Though we have also conducted an empirical analysis using a 
wavelet-based method to support the long-range dependence 
behavior inherent in 802.11, we agree with Willinger et al. 
[42] that it is of cardinal importance to advance a 
genuine physical understanding applicable to many other 
systems. We therefore believe that the essence of our analysis of 
long-range dependence lies in its mathematical explanation for the 
behavior. That is, the heavy-tailed inter-arrival time of each 
per-node transmission process causes long-range dependence 
of the aggregate transmission process at the base station, 
though this dependence is seldom observed. 

These results explore the fundamental principles characterizing 
the backoff process in 802.11. Some of them call to 
mind the beauty of simplicity governing the asymptotic 
dynamics of 802.11, and the others form the theoretical 
groundwork of short-term fairness. 
Appendix 

A. Derivation of (9) 

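Plugging $b_k = m^k b_0$ into the coefficient of variation yields 

$$c_v = \sqrt{\frac{T}{\left(\sum_{k=0}^{K} (b_0 m^k - 1/2)\gamma^k\right)^2} - 1}. \tag{30}$$

Here the numerator inside the square root is simplified as 

$$\begin{aligned}
T &:= \sum_{k=0}^{K} (b_0 m^k - 1/2)^2 \gamma^k \left(1 + \tfrac{1}{3}\right) + 2\sum_{k=1}^{K} (b_0 m^k - 1/2)\gamma^k \left(b_0 \frac{m^k - 1}{m - 1} - \frac{k}{2}\right) \\
  &= \sum_{k=0}^{K} \left(\tfrac{4}{3}(b_0 m^k - 1/2) + 2 b_0 \frac{m^k - 1}{m - 1} - k\right)(b_0 m^k - 1/2)\gamma^k.
\end{aligned}$$

Plugging the last line into (30) yields (9). 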
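The bookkeeping behind the numerator $T$ can be cross-checked numerically. The following is a minimal sketch, not the authors' code: it assumes that the backoff at stage $k$ is a continuous uniform variable with mean $b_0 m^k - 1/2$, that stages are independent, that stage $k$ is reached with probability $\gamma^k$, and that the last stage $K$ absorbs the remaining probability mass. The function names and the parameter values used in the check are illustrative assumptions.

```python
import math

def backoff_cv(b0, m, gamma, K):
    """Coefficient of variation via the closed-form numerator T (requires m != 1)."""
    mu = [b0 * m**k - 0.5 for k in range(K + 1)]  # mean backoff at stage k
    mean = sum(mu[k] * gamma**k for k in range(K + 1))
    # Second moments of the individual stages: E[U^2] = mu^2 * (1 + 1/3).
    T = sum(mu[k] ** 2 * gamma**k * (1 + 1 / 3) for k in range(K + 1))
    # Cross terms: stage k times the summed means of all earlier stages.
    T += 2 * sum(
        mu[k] * gamma**k * (b0 * (m**k - 1) / (m - 1) - k / 2)
        for k in range(1, K + 1)
    )
    return math.sqrt(T / mean**2 - 1)

def backoff_cv_direct(b0, m, gamma, K):
    """Same quantity from first principles, conditioning on the final stage."""
    mu = [b0 * m**k - 0.5 for k in range(K + 1)]
    mean = second = 0.0
    for k in range(K + 1):
        # Probability that the packet ends at stage k (assumed truncation at K).
        p = gamma**k * ((1 - gamma) if k < K else 1.0)
        m1 = sum(mu[: k + 1])                      # mean of the summed stages
        var = sum(u * u / 3 for u in mu[: k + 1])  # variance of the summed stages
        mean += p * m1
        second += p * (var + m1 * m1)
    return math.sqrt(second / mean**2 - 1)
```

Under these assumptions the two functions agree up to floating-point error, for example with `b0 = 16`, `m = 2`, `gamma = 0.3`, and `K = 6`.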



B. Proof of Theorem 5 

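Throughout the proof, we denote the sets of real numbers 
(positive real numbers), integers (positive integers), and 
rational numbers by $\mathbb{R}$ ($\mathbb{R}_+$), $\mathbb{Z}$ ($\mathbb{Z}_+$) and $\mathbb{Q}$, to simplify the 
exposition. Denoting the LST of $f_i(b)$ by $F_i(s)$, we begin the 
proof by considering the LST of (4): 

$$F(s) = \sum_{k=0}^{\infty} (1-\gamma)\gamma^k \prod_{i=0}^{k} F_i(s). \tag{31}$$

This is an infinite sum of the products of 

$$F_i(s) = \frac{1 - \exp(-(2 b_0 m^i - 1)s)}{(2 b_0 m^i - 1)s}, \tag{32}$$

that is, the LST of the uniform distribution with mean $b_0 m^i - 1/2$. 
For notational simplicity, we adopt the change of variable 
$x := 2 b_0 s$, so that $x$ also belongs to $\mathbb{R}_+$. Thus we have 

$$G(x) := \frac{\gamma}{1-\gamma} F\!\left(\frac{x}{2 b_0}\right) = \sum_{k=0}^{\infty} \gamma^{k+1} \prod_{i=0}^{k} \frac{1 - \exp(-(m^i - 1/(2 b_0))x)}{(m^i - 1/(2 b_0))x}. \tag{33}$$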
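The closed form (32) can be checked against a direct numerical evaluation of the transform. The following is a minimal sketch under the assumption that the stage-$i$ backoff is a continuous uniform variable on $(0, 2 b_0 m^i - 1)$; the function names, the midpoint rule, and the parameter values are illustrative choices, not part of the original analysis.

```python
import math

def uniform_lst_closed(s, b0, m, i):
    """Closed form (32): LST of a uniform density on (0, 2*b0*m**i - 1)."""
    w = 2 * b0 * m**i - 1
    # -expm1(-u) equals 1 - exp(-u) but is more accurate for small u.
    return -math.expm1(-w * s) / (w * s)

def uniform_lst_numeric(s, b0, m, i, steps=20_000):
    """Midpoint-rule approximation of E[exp(-s*U)] for the same uniform U."""
    w = 2 * b0 * m**i - 1
    h = w / steps
    total = sum(math.exp(-s * (j + 0.5) * h) for j in range(steps))
    return total * h / w  # integral of exp(-s*b) times the density 1/w
```

For `s > 0` the closed form lies strictly between 0 and 1, which is the property used to argue that the series defining $G(x)$ converges.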



Since $F_i(x) < 1$ for $x \in \mathbb{R}_+$, it is easy to see that $G(x)$ is 
convergent on $\mathbb{R}_+$. Then it follows from Bernstein's Theorem 
[20, pp. 439] that $G(x)$ is completely monotone. That is, 
$G(x) > 0$ and it has derivatives of all orders, which satisfy 

$$(-1)^i \frac{d^i G(x)}{dx^i} > 0, \quad x \in \mathbb{R}_+, \tag{34}$$

which implies that the $i$th derivative of $G(x)$ is strictly 
monotone for all $i \in \mathbb{Z}_+$. 

Step 1: Recursive relation in $G(\cdot)$ 



The crucial observation that paves the way for applying 
the theory of regular variation [10] is the following recursive 
relation hidden in the definition of $G(x)$: 

$$G(x) = \gamma F_0(x)\{1 + H(mx)\}, \tag{35}$$

where 

$$H(x) := \sum_{k=0}^{\infty} \gamma^{k+1} \prod_{i=0}^{k} \frac{1 - \exp(-(m^i - 1/(2 b_0 m))x)}{(m^i - 1/(2 b_0 m))x}. \tag{36}$$
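The recursion (35) can be verified numerically by truncating both series. The following is a minimal sketch; it assumes that $G$ and $H$ are the series with terms $\gamma^{k+1}\prod_{i=0}^{k}(\cdot)$ built from the uniform-distribution kernel of (32), and the truncation length `terms` and the helper names are hypothetical choices made only for illustration.

```python
import math

def phi(u):
    """LST kernel (1 - exp(-u)) / u of a uniform density, with phi(0) = 1."""
    return 1.0 if u == 0 else -math.expm1(-u) / u

def series(x, gamma, m, shift, terms):
    """sum_{k=0}^{terms-1} gamma^(k+1) * prod_{i=0}^{k} phi((m^i - shift) * x)."""
    total, prod = 0.0, 1.0
    for k in range(terms):
        prod *= phi((m**k - shift) * x)
        total += gamma ** (k + 1) * prod
    return total

def G(x, gamma, m, b0, terms=200):
    """Truncated series with shift 1/(2*b0)."""
    return series(x, gamma, m, 1 / (2 * b0), terms)

def H(x, gamma, m, b0, terms=200):
    """Truncated series with shift 1/(2*b0*m)."""
    return series(x, gamma, m, 1 / (2 * b0 * m), terms)

def recursion_gap(x, gamma, m, b0):
    """Absolute difference between the two sides of (35)."""
    F0 = phi((1 - 1 / (2 * b0)) * x)
    return abs(G(x, gamma, m, b0) - gamma * F0 * (1 + H(m * x, gamma, m, b0)))
```

The gap is at the level of floating-point error because shifting the product index by one turns each factor $(m^{i+1} - 1/(2b_0))x$ into $(m^i - 1/(2 b_0 m))\,mx$, which is exactly the argument appearing in $H(mx)$.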



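Let $\alpha := -(\log\gamma)/\log m \in \mathbb{R}_+$ and $z := \lceil\alpha\rceil \in \mathbb{Z}_+$, which 
designates the smallest integer not less than $\alpha$. It follows from 
$\alpha > 0$ that $z \ge 1$. Appealing to Theorem 2 and the basic 
property of the LST, i.e., that $\lim_{s \to 0+} d^i F(s)/ds^i$ equals $(-1)^i$ 
times the $i$th moment of the backoff time for $i \in \mathbb{Z}_+$, it follows that 

$$\frac{d^i G(x)}{dx^i} = \frac{\gamma}{(2 b_0)^i (1-\gamma)} \cdot \frac{d^i F(s)}{ds^i}, \quad s = \frac{x}{2 b_0}.$$

Recall that $\lim_{s \to 0+} d^i F(s)/ds^i$ is finite for $i < z$ by Theorem 2. 
Likewise, it is easy to see that the limits of $d^i G(x)/dx^i$ and 
$d^i H(x)/dx^i$ as $x \to 0+$ are finite for $i < z$ and infinite for 
$i \ge z$. Moreover, since both the numerator 
and denominator of the limit 

$$\lim_{x \to 0+} \frac{d^z G(x)/dx^z}{d^z H(x)/dx^z} \tag{37}$$

are infinite, we may remove an arbitrary number of products 
whose $z$th derivatives at $x = 0+$ are finite from both 
$G(x)$ and $H(x)$, implying in turn that we may replace the summation 
$\sum_{k=0}^{\infty}$ in (33) and (36) by $\sum_{k=k'}^{\infty}$ for any $k' \in \mathbb{Z}_+$. 
Formally speaking, we have 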



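$$G(x) = \sum_{k=0}^{k'-1} \gamma^{k+1} \prod_{i=0}^{k} F_i(x) + \gamma^{k'} \prod_{i=0}^{k'-1} F_i(x) \cdot \sum_{k=k'}^{\infty} \gamma^{k-k'+1} \prod_{i=k'}^{k} F_i(x), \tag{38}$$

where $F_i(x)$ denotes the $i$th factor of the product in (33) and only the $z$th 
derivative of the third term of (38), i.e., the infinite sum, becomes infinite as $x \to 0+$. 
Therefore, we can easily see that the difference between $G(x)$ 
and $H(x)$ vanishes as $k'$ increases and hence (37) must be 1. 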

Taking derivatives of both sides of (35) $z$ times and after 
some manipulation, it becomes clear that it is sufficient to 
consider only the infinite terms, which are related to each other in 
the following form: 

$$h(m) := \lim_{x \to 0+} \frac{G^{(z)}(mx)}{G^{(z)}(x)} = \lim_{x \to 0+} \frac{H^{(z)}(mx)}{G^{(z)}(x)} \cdot \lim_{x \to 0+} \frac{G^{(z)}(mx)}{H^{(z)}(mx)} = m^{-z}\gamma^{-1}, \tag{39}$$

where $G^{(z)}$ and $H^{(z)}$ denote the $z$th derivatives and we also exploited 
the fact that (37) is 1. Since $\gamma = m^{-\alpha}$, the right-hand side equals 
$m^{\alpha - z}$. Because the 
convergence of (39) holds for any real sequence $x_k \to 0+$, 
we have that $h(y) = y^{\alpha - z}$ for $y \in M$, where $M := \{m^i \mid i \in 
\mathbb{Z}\}$ is a countably infinite set that is nowhere dense in $\mathbb{R}_+$. The 
set on which the relation $h(y) = y^{\alpha - z}$ holds is often called the 
quantifier set in regular variation theory. 
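As a worked arithmetic check of Step 1, the identity $m^{-z}\gamma^{-1} = m^{\alpha - z}$ and its extension to the powers $m^i$ can be evaluated directly. The following is a minimal sketch; the function names and the sample values $\gamma = 0.3$, $m = 2$ are illustrative assumptions.

```python
import math

def tail_index(gamma, m):
    """Return alpha = -(log gamma) / log m and z = ceil(alpha)."""
    alpha = -math.log(gamma) / math.log(m)
    return alpha, math.ceil(alpha)

def h_on_M(i, gamma, m):
    """h(m^i) obtained by iterating (39) i times: (m^-z / gamma)^i."""
    _, z = tail_index(gamma, m)
    return (m ** (-z) / gamma) ** i
```

With $\gamma = 0.3$ and $m = 2$ this gives $\alpha \approx 1.74$ and $z = 2$, and `h_on_M(i, 0.3, 2)` coincides with $(m^i)^{\alpha - z}$ for every integer $i$, which is the relation $h(y) = y^{\alpha - z}$ restricted to $M$.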






Step 2: Quantifier set is dense in $\mathbb{R}_+$ 

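We will show that $h(y) = y^{\alpha - z}$ holds on a dense subset $L$ 
of $\mathbb{R}_+$. Define a set 

$$L := \{\lambda \in \mathbb{R}_+ \mid (\log\lambda)/\log m \in \mathbb{R}\setminus\mathbb{Q}\}$$

where $\mathbb{R}\setminus\mathbb{Q}$ is the set of irrational numbers. It should be clear 
that $M$ and $L$ are disjoint, i.e., $M \cap L = \emptyset$, and the set $L$ 
is dense in $\mathbb{R}_+$ because it can be rewritten as 
$L = \{m^y \in \mathbb{R}_+ \mid y \in \mathbb{R}\setminus\mathbb{Q}\}$. Defining 

$$T(y,x) := \frac{G^{(z)}(yx)}{G^{(z)}(x)},$$

where $G^{(z)}$ denotes the $z$th derivative of $G$, 
we can see that $T(y,x)$ is strictly decreasing in $y$ because it 
follows from (34), i.e., complete monotonicity, that 

$$\frac{\partial T(y,x)}{\partial y} = \frac{x\,G^{(z+1)}(yx)}{G^{(z)}(x)} < 0.$$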

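Pick $\lambda \in L$ in the interval $(m^i, m^{i+1})$ for any $i \in \mathbb{Z}$. Since 
$T(y,x) > 0$ is strictly decreasing in $y$, $T(\lambda,x)$ is upper-bounded 
by $m^{i(\alpha - z)}$ as $x \to 0+$, meaning that $T(\lambda,x)$ is ultimately 
bounded in $x$. From its series expansion, it is easy to see that 
it is ultimately monotone in $x$ as $x \to 0+$. Then we can apply 
[33, Theorem 3.14] to show that there exists $\beta$ such that 

$$h(\lambda) = \lim_{x \to 0+} T(\lambda, x) = \lambda^{\beta - z}, \tag{40}$$

which in turn implies that $h(\lambda^j) = \lambda^{j(\beta - z)}$, $\forall j \in \mathbb{Z}$, as (39) 
did. Assume that $\beta \neq \alpha$. Because $T(y,x)$ is strictly decreasing 
in $y$, irrespective of $x$, we have 

$$m^{\alpha - z} \le \lim_{x \to 0+} T(y,x) \le 1 \tag{41}$$

for $y \in (1, m)$. Put $y := m^{-\lfloor j(\log\lambda)/\log m \rfloor}\lambda^j$ for nonzero $j \in \mathbb{Z}$. This 
can be rearranged as 

$$y = m^{\,j(\log\lambda)/\log m - \lfloor j(\log\lambda)/\log m \rfloor},$$

and $(\log\lambda)/\log m$ is irrational, hence its exponent is on $(0,1)$ 
and $y$ is on the interval $(1, m)$. We now have from (39) and 
(40) that 

$$\lim_{x \to 0+} T(y,x) = \lim_{x \to 0+} T\!\left(m^{-\lfloor j(\log\lambda)/\log m \rfloor}, \lambda^j x\right) T(\lambda^j, x) = m^{-\lfloor j(\log\lambda)/\log m \rfloor(\alpha - z)}\,\lambda^{j(\beta - z)},$$

where the key point is that the second equality follows from 
$M \cap L = \emptyset$. Since the last term belongs to the closed interval 

$$I_j := \left[m^{\alpha - z}\lambda^{j(\beta - \alpha)},\ \lambda^{j(\beta - \alpha)}\right]$$

and $\beta \neq \alpha$, we must be able to pick $j \in \mathbb{Z}$ such that $I_j$ does 
not overlap with $[m^{\alpha - z}, 1]$. This proves by contradiction that 
$h(\lambda) = \lambda^{\alpha - z}$ holds for $\lambda \in L$ that is dense in $\mathbb{R}_+$. 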
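The contradiction argument of Step 2 can be illustrated numerically. The following is a minimal sketch with hypothetical names: `reduced_point` computes $y = m^{-\lfloor j\theta \rfloor}\lambda^j$ with $\theta = (\log\lambda)/\log m$, and `separating_power` searches for a power $j$ whose interval $[m^{\alpha - z}\lambda^{j(\beta - \alpha)}, \lambda^{j(\beta - \alpha)}]$ misses $[m^{\alpha - z}, 1]$. The sample values $\lambda = 3$, $m = 2$, $\alpha = 1.5$, $z = 2$, $\beta = 1.2$ are assumptions chosen only for illustration.

```python
import math

def reduced_point(lam, m, j):
    """y = m^(-floor(j*theta)) * lam^j with theta = log(lam) / log(m)."""
    theta = math.log(lam) / math.log(m)
    # Only the fractional part of j*theta survives, so y lies in [1, m).
    return m ** (j * theta - math.floor(j * theta))

def separating_power(lam, m, alpha, z, beta, max_j=1000):
    """First j in the order 1, -1, 2, -2, ... whose interval misses
    [m^(alpha-z), 1]; returns None if no such |j| <= max_j exists."""
    lo_ref = m ** (alpha - z)
    for n in range(1, max_j + 1):
        for j in (n, -n):
            hi = lam ** (j * (beta - alpha))  # upper end of the interval
            lo = lo_ref * hi                  # lower end of the interval
            if hi < lo_ref or lo > 1.0:
                return j
    return None
```

Whenever `beta != alpha`, the factor $\lambda^{j(\beta - \alpha)}$ drifts away from 1 geometrically in $j$, so a separating power is found; when `beta == alpha` the interval coincides with $[m^{\alpha - z}, 1]$ for every $j$ and the search returns `None`.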

Step 3: Applying regular variation theory 

Applying the 'Karamata Theorem for monotone functions' 
[10, Theorem 1.10.2] to the conclusion we obtained in Step 2 
establishes that $d^z G(x)/dx^z$ is regularly varying (on the right) at 
the origin $x = 0$ with index $\alpha - z$. Formally speaking, $G(s)$ 
satisfies 

$$(-1)^z \frac{d^z G(s)}{ds^z} \sim s^{\alpha - z}\,\ell(1/s) \quad \text{as } s \to 0+, \tag{42}$$

where $\ell(x)$ is slowly varying at infinity $x = \infty$, i.e., 

$$\lim_{x \to \infty} \ell(yx)/\ell(x) = 1$$

for all $y \in \mathbb{R}_+$. Note that the original Karamata Tauberian 
Theorem in [10, Theorem 1.7.1] and [20, pp. 445] cannot 
be applied due to the fact $\alpha - z < 0$. These theorems are 
complemented by the modified Karamata Tauberian Theorem 
in [10, Theorem 8.1.6] and [5], which we apply to (42) to 
show (15). 
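The role of the slowly varying factor in (42) can be illustrated with a toy function. The following is a minimal sketch, assuming the hypothetical choice $\ell(x) = \log x$ and an illustrative index $\rho = -0.5$ standing in for $\alpha - z$; neither choice is derived from the backoff model.

```python
import math

def example(rho):
    """f(s) = s^rho * log(1/s): regularly varying at 0+ with index rho."""
    return lambda s: s**rho * math.log(1.0 / s)

def ratio(f, y, s):
    """f(y*s) / f(s), which tends to y^rho as s -> 0+ for such f."""
    return f(y * s) / f(s)
```

For `f = example(-0.5)` and `y = 2`, the ratio moves toward $2^{-0.5}$ as `s` decreases, but only at the logarithmic rate dictated by $\ell$, which is why the limit in the definition of slow variation is taken before any power-law index is read off.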



